Aggregated queries

Explains how to make aggregate queries over data. Samples demostrating some common use cases.

Overview

Aggregated queries enable you to perform aggregation operations on records in entities, such as counting, summing, or averaging.

This feature is implemented using the MongoDB’s aggregation framework, so you’ll notice many similarities. This guide provides all the information you need to write aggregate queries, and you shouldn’t need to refer to MongoDB’s documentation.

In essence, there is a pipeline where the input consists of records in the entity, and you apply various operations to filter, sort, and aggregate the data. Here’s an example of an aggregate query in JavaScript:

var resultSet = sys.data.aggregate('contacts', [
  {match: {'company.isCustomer': true}},
  {group: {by: 'company', totalSkills: 'sum(numberOfSkills)'}}
]);
while (resultSet.hasNext()) {
  var result = resultSet.next();
  // the format of the result is a simple object you can stringify
  log(JSON.stringify(result));
}

In this pipeline, we have two steps:

  1. Select only records associated with a customer.
  2. Group by company and sum the field numberOfSkills, storing it in the field totalSkills. The input of one step is the result of the previous one, which is why it’s called a pipeline.

The result set for the above example will contain records with the following structure:

{
  "company": {
    "id": "57fd2d61e4b0ce322b0c530d",
    "label": "Flashset"
  },
  "totalSkills": 12
}

As you can see, you get a simple JavaScript object (it follows the JSON representation you can see in the REST API object) instead of something like a record object. This is because the structure changes based on the operations in your pipeline, and the result is no longer a record from an entity. You should check each type’s documentation to verify the format used in the JSON representation.

Aggregated queries, like other queries, can be expressed using a query map, a query builder, and the REST API. Below, you’ll find samples for all versions.

Operations

Count

The Count operation allows you to count elements and store the result in a field. This field cannot contain special characters; only numbers and letters are allowed, and it will be the only field in the result.

Here are some samples of how you can use it:

// Counts companies from New York
var resultSet = sys.data.aggregate('companies', [
    {match: {'address.state': 'NY'}, count: 'counter_field'}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Counts companies from New York
var query = sys.data.createQuery('companies')
    query.match().field('address.state').equals('NY'); query.count().counterName('counter_field');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Counts companies from New York
PUT /data/companies/aggregate

{"match": {"address.state": "NY"}, "count": "counter_field"}

When you run the above query, you will get output similar to this:

{"counter_field":2}

Skip

The Skip operation allows you to skip a specified number of elements. This operator is configured with a positive integer, which determines how many results to omit from the beginning.

Here are some samples of how you can use it:

// Retrieves contacts while skipping the first 3 results
var resultSet = sys.data.aggregate('contacts', [
    {skip: 3}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Retrieves contacts while skipping the first 3 results
var query = sys.data.createQuery('contacts')
    query.skip().step(3);
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Retrieves contacts while skipping the first 3 results
PUT /data/companies/aggregate

{"skip": 3}

Limit

The Limit operation allows you to limit the number of elements in your aggregation results. This operator is configured with a positive integer and retrieves the specified number of results.

Here are some samples of how you can use it:

// retrieves first 5 contacts
var resultSet = sys.data.aggregate('contacts', [
    {limit: 5}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// retrieves first 5 contacts
var query = sys.data.createQuery('contacts')
    query.limit().step(5);
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// retrieves first 5 contacts
PUT /data/companies/aggregate

{"limit": 5}

Unwind

The Unwind operation allows you to unwind multi-valued fields. This operator is configured with a field; if this field is an array, it will retrieve the same record with a single value for the selected field for each element in the array.

Here are some samples of how you can use it:

// retrieves companies unwinding services
var resultSet = sys.data.aggregate('companies', [
    {unwind: {fieldPath: 'services', includeEmpty: true}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// retrieves companies unwinding services
var query = sys.data.createQuery('companies')
    query.unwind().path('services').includeEmpty(true);
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// retrieves companies unwinding services
PUT /data/companies/aggregate

{"unwind": {"fieldPath": "services", "includeEmpty": true}}

When you run the above query, you will get output similar to this:

{"id":"57fd2d64e4b0ce322b0c8349","name":"Photolist","services":"SERVICE_A"}
{"id":"57fd2d64e4b0ce322b0c8349","name":"Photolist","services":"SERVICE_C"}
{"id":"57fd2d62e4b0ce322b0c6268","name":"DabZ","services":"SERVICE_A"}
{"id":"57fd2d62e4b0ce322b0c6268","name":"DabZ","services":"SERVICE_B"}
{"id":"57fd2d62e4b0ce322b0c6268","name":"DabZ","services":"SERVICE_D"}
{"id":"57fd2d60e4b0ce322b0c50f0","name":"Mydo","services":null}
{"id":"57fd2d64e4b0ce322b0c7d12","name":"Zazio","services":"SERVICE_C"}
...

Lookup

The Lookup operation allows you to lookup elements from other entities in current records. This operator is equivalent to a SQL left-join using the name of a foreign entity, a foreign field in that entity, and a local field in the current entity (set in the query).

Here are some samples of how you can use it:

// retrieves companies with related contacts
var resultSet = sys.data.aggregate('companies', [
    {lookup: {localFieldPath: 'id', foreignFieldPath: 'company.id', foreignEntityName: 'contacts', 'as': 'relatedContacts'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// retrieves companies with related contacts
var query = sys.data.createQuery('companies')
    query.lookup().localField('id').foreignField('company.id').foreignEntity('contacts').as('relatedContacts');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// retrieves companies with related contacts
PUT /data/companies/aggregate

{"lookup": {"localFieldPath": "id", "foreignFieldPath": "company.id", "foreignEntityName": "contacts", "as": "relatedContacts"}}

When you run the above query, you will get output like this:

{"id":"57fd2d64e4b0ce322b0c8349",
"name":"Photolist",
"relatedContacts":[
  {
    "id": "5506fc44c2eee3b1a702694c",
    "company": {
      "id": "5506fc43c2eee3b1a7026944",
      "label": "Photolist"
    },
    "firstName": "John",
    "lastName": "Doe",
    "email": "john.doe@abcinc.com",
    "phoneNumbers": [
      "3039514211",
      "3039514210"
    ]
  },
  {
    "id": "5506fc44c2eee3b1a702694d",
    "company": {
      "id": "5506fc43c2eee3b1a7026944",
      "label": "Photolist"
    },
    "firstName": "Martin",
    "lastName": "Smith",
    "email": "martin.smith@abcinc.com"
  }]
}
{"id":"57fd2d62e4b0ce322b0c6268","name":"DabZ","relatedContacts":[
  {
    "id": "5506fc44c2eee3b1a702694e",
    "company": {
      "id": "5506fc43c2eee3b1a7026946",
      "label": "DabZ"
    },
    "firstName": "William",
    "lastName": "Brown",
    "email": "william.brown@acme.com",
    "phoneNumbers": [
      "3039514211",
      "3039514210"
    ]}]
  }
{"id":"57fd2d60e4b0ce322b0c50f0","name":"Mydo","relatedContacts":[]}
{"id":"57fd2d64e4b0ce322b0c7d12","name":"Zazio","relatedContacts":[]}
...

Match

The Match operation allows you to filter records in the pipeline. You need to use this operation to select the records you want to use for your aggregation.

This filter works the same way as the ones for regular queries, which means you can pass a query map or query builder (if you are using the JavaScript API). Keep in mind that only filters by fields are supported here while other parameters are ignored.

Here are some samples of matching operations:

// Counts the number of skills for contacts of customers // see how the match operator filters contacts where company is a customer
var resultSet = sys.data.aggregate('contacts', [
    [{match: {'company.isCustomer': true}}, {group: {totalNumberOfSkills: 'sum(numberOfSkills)'}}]
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Counts the number of skills for contacts of customers // see how the match operator filters contacts where company is a customer
var query = sys.data.createQuery('contacts')
    query.match().field('company.isCustomer').equals(true); query.group().accumulate('totalNumberOfSkills').sum('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Counts the number of skills for contacts of customers // see how the match operator filters contacts where company is a customer
PUT /data/companies/aggregate

[{"match": {"company.isCustomer": true}}, {"group": {"totalNumberOfSkills": "sum(numberOfSkills)"}}]
//  the match operator filters contacts by the company name
var resultSet = sys.data.aggregate('contacts', [
    [{match: {'company.isCustomer': true, 'address.state': 'NJ'}}, {group: {totalNumberOfSkills: 'sum(numberOfSkills)'}}]
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
//  the match operator filters contacts by the company name
var query = sys.data.createQuery('contacts')
    query.match().field('company.isCustomer').equals(true).field('address.state').equals('NJ'); query.group().accumulate('totalNumberOfSkills').sum('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
//  the match operator filters contacts by the company name
PUT /data/companies/aggregate

[{"match": {"company.isCustomer": true, "address.state": "NJ"}}, {"group": {"totalNumberOfSkills": "sum(numberOfSkills)"}}]

// Retrieves contacts with a looked up company. The contacts are filtered by the name of the related company
var resultSet = sys.data.aggregate('contacts', [
    [{'lookup': {'localFieldPath': 'company.id', 'foreignFieldPath': 'id', 'foreignEntityName': 'companies', 'as': 'relatedCompanies'}},{match: {'relatedCompany.name': 'ABC'}}]
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Retrieves contacts with a looked up company. The contacts are filtered by the name of the related company
var query = sys.data.createQuery('contacts')
    query.lookup().localField('company.id').foreignField(.id').foreignEntity('companies').as('relatedContacts')\n;query.match().field('relatedCompanies.name').equals('ABC');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Retrieves contacts with a looked up company. The contacts are filtered by the name of the related company
PUT /data/companies/aggregate

[{"lookup": {"localFieldPath": "id", "foreignFieldPath": "company.id", "foreignEntityName": "contacts", "as": "relatedContacts"}},{"match": {"relatedCompany.name": "ABC"}}]

Sort

Allows to change the sorting of records in your aggregation pipeline. This is useful for sorting the final result or to use together with accumulators like first() or last() in the group operator.

Here are some samples of how you can use it:

// finds the contact with more skills per company
var resultSet = sys.data.aggregate('contacts', [
    {sort: {'numberOfSkills': 'desc'}}, {group: {by: 'company', firstName: 'first(firstName)', lastName: 'first(lastName)', skills: 'first(numberOfSkills)'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the contact with more skills per company
var query = sys.data.createQuery('contacts')
    query.sort().by('numberOfSkills', 'desc'); query.group().by('company').accumulate('firstName').first('firstName').accumulate('lastName').first('lastName').accumulate('skills').first('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the contact with more skills per company
PUT /data/companies/aggregate

{"sort": {"numberOfSkills": "desc"}}, {"group": {"by": "company", "firstName": "first(firstName)", "lastName": "first(lastName)", "skills": "first(numberOfSkills)"}}

// finds the contact with more skills per company and sorts the result by skills and last name
var resultSet = sys.data.aggregate('contacts', [
    {sort: {'numberOfSkills': 'desc'}}, {group: {by: 'company', firstName: 'first(firstName)', lastName: 'first(lastName)', skills: 'first(numberOfSkills)'}}, {sort: {'skills': 'desc', 'lastName': 'asc'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the contact with more skills per company and sorts the result by skills and last name
var query = sys.data.createQuery('contacts')
    query.sort().by('numberOfSkills', 'desc'); query.group().by('company').accumulate('firstName').first('firstName').accumulate('lastName').first('lastName').accumulate('skills').first('numberOfSkills'); query.sort().by('skills', 'desc').by('lastName', 'asc');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the contact with more skills per company and sorts the result by skills and last name
PUT /data/companies/aggregate

{"sort": {"numberOfSkills": "desc"}}, {"group": {"by": "company", "firstName": "first(firstName)", "lastName": "first(lastName)", "skills": "first(numberOfSkills)"}}, {"sort": {"skills": "desc", "lastName": "asc"}}

Project

The Project operation allows you to remove fields from records and reduce memory usage. This is important if you need to process many records but only need a few fields.

Here are some samples of how to use it:

// leaves only number of employees and the sums up
var resultSet = sys.data.aggregate('companies', [
    {project: 'name,numberOfEmployees'}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// leaves only number of employees and the sums up
var query = sys.data.createQuery('companies')
    query.project().field('name').field('numberOfEmployees');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// leaves only number of employees and the sums up
PUT /data/companies/aggregate

{"project": "name,numberOfEmployees"}

When you run the above query, you will get output like this:

{"id":"57fd2d64e4b0ce322b0c8349","name":"Photolist","numberOfEmployees":83}
{"id":"57fd2d62e4b0ce322b0c6268","name":"DabZ","numberOfEmployees":635}
{"id":"57fd2d60e4b0ce322b0c50f0","name":"Mydo","numberOfEmployees":917}
{"id":"57fd2d64e4b0ce322b0c7d12","name":"Zazio","numberOfEmployees":618}
...

You can see that only the fields name and numberOfEmployees were included in the output (and id that is always there).

Group

The Group operation is the most important operation in your aggregation pipeline as it aggregates data. It allows you to group a set of records based on some fields and accumulate the results of those records into a field.

For example:

// calculate the total number of skills per company
var resultSet = sys.data.aggregate('contacts', [
    {project: 'company,numberOfSkills'}, {group: {by: 'company', totalSkills: 'sum(numberOfSkills)'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// calculate the total number of skills per company
var query = sys.data.createQuery('contacts')
    query.project().field('company').field('numberOfSkills'); query.group().by('company').accumulate('totalSkills').sum('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// calculate the total number of skills per company
PUT /data/companies/aggregate

{"project": "company,numberOfSkills"}, {"group": {"by": "company", "totalSkills": "sum(numberOfSkills)"}}

For this query, you will get a result set like this:

{"company":{"id":"57fd2d60e4b0ce322b0c503d","label":"Jabbercube"},"totalSkills":4}
{"company":{"id":"57fd2d63e4b0ce322b0c75b0","label":"Skilith"},"totalSkills":4}
{"company":{"id":"57fd2d61e4b0ce322b0c5dc6","label":"Trupe"},"totalSkills":4}
{"company":{"id":"57fd2d60e4b0ce322b0c4c22","label":"Feedbug"},"totalSkills":8}
{"company":{"id":"57fd2d62e4b0ce322b0c65ce","label":"Realcube"},"totalSkills":8}
{"company":{"id":"57fd2d61e4b0ce322b0c530d","label":"Flashset"},"totalSkills":12}
{"company":{"id":"57fd2d60e4b0ce322b0c51fe","label":"Thoughtworks"},"totalSkills":8}
{"company":{"id":"57fd2d60e4b0ce322b0c4e02","label":"Mynte"},"totalSkills":0}

Here, you can see how data was grouped by the fields indicated in the by option (in this case, the company field), and then you have accumulators defined that are added as fields in the output (in this case, the totalSkills field).

Basically, the by option indicates how records will be grouped. Records that have the same value(s) in the fields listed in this option will belong to the same group. You can put more than one field if you separate the names using commas or call by() multiple times in the query builder.

Then you can define any number of accumulators. Each accumulator will be a field in the output and must use one of the accumulator operations available:

count()

This operation returns the number of records in the group. For example, you can count the number of contacts for each company.

// counts contacts on each company
var resultSet = sys.data.aggregate('contacts', [
    {project: 'company'}, {group: {by: 'company', numberOfContacts: 'count()'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// counts contacts on each company
var query = sys.data.createQuery('contacts')
    query.project().field('company').field('numberOfSkills'); query.group().by('company').accumulate('numberOfContacts').count();
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// counts contacts on each company
PUT /data/companies/aggregate

{"project": "company"}, {"group": {"by": "company", "numberOfContacts": "count()"}}

sum(field)

This operation calculates the sum of the values in a specified field for the records in the group. For example, you can calculate the total number of skills per company.

// calculate the total number of skills per company
var resultSet = sys.data.aggregate('contacts', [
    {project: 'company,numberOfSkills'}, {group: {by: 'company', totalSkills: 'sum(numberOfSkills)'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// calculate the total number of skills per company
var query = sys.data.createQuery('contacts')
    query.project().field('company').field('numberOfSkills'); query.group().by('company').accumulate('totalSkills').sum('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// calculate the total number of skills per company
PUT /data/companies/aggregate

{"project": "company,numberOfSkills"}, {"group": {"by": "company", "totalSkills": "sum(numberOfSkills)"}}

This only works for number fields like integer, money, decimal or percentage.

avg(field)

This operation calculates the average of the values in a specified field for the records in the group. It works for number fields like integer, money, decimal, or percentage.

// calculate the average number of skills per company per contact
var resultSet = sys.data.aggregate('contacts', [
    {project: 'company,numberOfSkills'}, {group: {by: 'company', avgSkills: 'avg(numberOfSkills)'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// calculate the average number of skills per company per contact
var query = sys.data.createQuery('contacts')
    query.project().field('company').field('numberOfSkills'); query.group().by('company').accumulate('avgSkills').avg('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// calculate the average number of skills per company per contact
PUT /data/companies/aggregate

{"project": "company,numberOfSkills"}, {"group": {"by": "company", "avgSkills": "avg(numberOfSkills)"}}

Keep in mind that if the value of the field is null or the field isn’t present at all in the record it won’t count for the average calculation.

first(field)

This operation selects the value of the first record as the value of the output. It is often used in combination with the sort operator to find the first value of a field.

// finds the contact with more skills per company
var resultSet = sys.data.aggregate('contacts', [
    {sort: {'numberOfSkills': 'desc'}}, {group: {by: 'company', firstName: 'first(firstName)', lastName: 'first(lastName)', skills: 'first(numberOfSkills)'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the contact with more skills per company
var query = sys.data.createQuery('contacts')
    query.sort().by('numberOfSkills', 'desc'); query.group().by('company').accumulate('firstName').first('firstName').accumulate('lastName').first('lastName').accumulate('skills').first('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the contact with more skills per company
PUT /data/companies/aggregate

{"sort": {"numberOfSkills": "desc"}}, {"group": {"by": "company", "firstName": "first(firstName)", "lastName": "first(lastName)", "skills": "first(numberOfSkills)"}}

last(field)

This operation selects the value of the last record as the value of the output. It is often used in combination with the sort operator to find the last value of a field.

// finds the contact with more skills per company
var resultSet = sys.data.aggregate('contacts', [
    {sort: {'numberOfSkills': 'asc'}}, {group: {by: 'company', firstName: 'last(firstName)', lastName: 'last(lastName)', skills: 'last(numberOfSkills)'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the contact with more skills per company
var query = sys.data.createQuery('contacts')
    query.sort().by('numberOfSkills', 'asc'); query.group().by('company').accumulate('firstName').last('firstName').accumulate('lastName').last('lastName').accumulate('skills').last('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the contact with more skills per company
PUT /data/companies/aggregate

{"sort": {"numberOfSkills": "asc"}}, {"group": {"by": "company", "firstName": "last(firstName)", "lastName": "last(lastName)", "skills": "last(numberOfSkills)"}}

max(field)

This operation selects the maximum value from all records as the value of the output.

// finds the maximum number of skills per company for one contact
var resultSet = sys.data.aggregate('contacts', [
    {group: {by: 'company', skills: 'max(numberOfSkills)'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the maximum number of skills per company for one contact
var query = sys.data.createQuery('contacts')
    query.group().by('company').accumulate('skills').max('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the maximum number of skills per company for one contact
PUT /data/companies/aggregate

{"group": {"by": "company", "skills": "max(numberOfSkills)"}}

min(field)

This operation selects the minimum value from all records as the value of the output.

// finds the minimum number of skills per company for one contact
var resultSet = sys.data.aggregate('contacts', [
    {group: {by: 'company', skills: 'min(numberOfSkills)'}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the minimum number of skills per company for one contact
var query = sys.data.createQuery('contacts')
    query.group().by('company').accumulate('skills').min('numberOfSkills');
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// finds the minimum number of skills per company for one contact
PUT /data/companies/aggregate

{"group": {"by": "company", "skills": "min(numberOfSkills)"}}

GeoNear

The GeoNear operation outputs documents in order of nearest to farthest from a specified point. You can only as the first stage of a pipeline.

Here there are some options accepted by this operator:

FieldTypeDescription
distanceFieldstringThe output field that contains the calculated distance.
distanceMultipliernumberOptional. The factor to multiply all distances returned by the query.
includeLocsstringOptional. This specifies the output field that identifies the location used to calculate the distance. This option is useful when a location field contains multiple locations.
keystringOptional. Specify the geospatial field to use when calculating the distance. An error will be thrown if you have many location fields and you don’t specify which one to use.
maxDistancenumberOptional. The maximum distance from the center point that the documents can be. Specify this in meters.
minDistancenumberOptional. The minimum distance from the center point that the documents can be. Specify this in meters.
coordinatesobjectThe point for which to find the closest documents. should be a js object specifying longitude and latitude.

GeoNear Samples

Simple aggregate queries:

// Sorts records by proximity and filters them by min and max distance. Provides the calculated distance value in a custom field 
var resultSet = sys.data.aggregate('companies', [
    {geoNear: {coordinates: {longitude:10, latitude:5}, distanceField:'distance', minDistance:5566, maxDistance:9460000}}
]);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Sorts records by proximity and filters them by min and max distance. Provides the calculated distance value in a custom field 
var query = sys.data.createQuery('companies')
    query_.geoNear().coordinates({longitude:10,latitude:05}).distanceField('distance').minDistance(5566).maxDistance(9460000);
var resultSet = sys.data.aggregate(query);
log('total: ' + resultSet.count());
while (resultSet.hasNext()) {
    log(resultSet.next().label());
}
// Sorts records by proximity and filters them by min and max distance. Provides the calculated distance value in a custom field 
PUT /data/companies/aggregate

[{"geoNear": {"coordinates": {"longitude":10, "latitude":5}}, "distanceField": "distance", "minDistance": 5566, "maxDistance":9460000}]

Pagination of records with geo near operator. Instead of skipping values to paginate it is recommended to use the min distance parameter. Here you can find an interesting example:

The following example paginates results by 10.

     var totals = [];
     let minDistance = 0;
     let lastId;
     var resultSet = sys.data.aggregate('companies', [
       {geoNear: {
          coordinates: {longitude:10,latitude:5},
          distanceField:'distance',
          minDistance:minDistance}
       },
       {limit: 10}
      ]);
       while (resultSet.hasNext()) {
       let value = resultSet.next()
           totals.push(value);
           minDistance = value.distance
           lastId = value.id
       }\n

        // second pagination. We set the latest minimum distance and we skip the first value by id

     let minDistance = minDistance;
     var resultSet = sys.data.aggregate('companies', [
       {geoNear: {
          coordinates: {longitude:10,latitude:5},
          distanceField:'distance',
          minDistance:minDistance}
       },
       {match: {id: 'notEquals(lastId)'}},
       {limit: 10}
      ]);
       while (resultSet.hasNext()) {
       let value = resultSet.next()
           totals.push(value);
           minDistance = value.distance
           lastId = value.id
       }\n

Limitations

These aggregation operations allow you to perform various calculations and data manipulations in your aggregated queries. However, it’s essential to keep in mind that aggregated queries work well for small to moderate-sized datasets. If you need to analyze large datasets, it’s recommended to use dedicated data analysis tools and services like Google BigQuery or Amazon Redshift.