Aggregation is an analysis and aggregation of data. An aggregation occurs in one or more stages that process documents. Documents can be filtered, sorted, grouped, and transformed in a pipeline.
Outputs of one stage become inputs for another. Aggregations do not affect the original source data.
db.collection.aggregate([
{ $stage_name: {} },
{ $stage_name: {} }
])
Each stage is a single operation on the data. Common operation stages are:
Each stage has its own syntax to carry out its operations.
Field names prefixed with a dollar sign are called "field paths". It allows us to refer to the value in that field.
For example, the code below grabs the values of thefirst_name
andlast_name
value fields and sets the concatenation of these values todefaultUsername
.
$set: {
defaultUsername: {
$concat: ["$first_name", "$last_name"]
}
}
An aggregation is a collection and summary of data. A stage is a built-in methode that can be completed on data (does not alter it). An aggregation pipeline is a series of stages completed in data in some order.
db.collection.aggregate([
{
$stage1: {
{ expression1 }...
},
$stage2: {
{ expression1 }...
}
}
])
Stage Syntax:
{$match: {
Query Syntax:
{$expr: {
Method Syntax:
db.collection.aggregate([])
These are very common aggregations.
* match airbnb example *
Find all airbnbs with 5 or more bedrooms
{ $match: { "beds": { $gte: 5 } } },
_id
, the field to group by. d. it may also include one or more fields with an accumulator. e. the accumulator specifies how to aggregate the information for each of the groups.$group generic example:
$group:
{
_id: , // group key
: { : }
}
$group airbnb example
Find all airbnbs with 5 or more bedrooms and return the count of each record.
db.listingsAndReviews.aggregate([
{ $match: { "beds": { $gte: 5 } } },
{ $group: { _id: "$beds", count: { $count: {} } } },
{ $sort: { count: -1 } },
])
I had already used sort earlier to make the results more readable.
$sort: as demonstrated in the $group example, the $sort stage sort by some field, such as the count field above.
represents ascending order, while-1
represents descending order.$limit: limits the number of documents that are passed to the next stage.
only takes a positive integerdb.listingsAndReviews.aggregate([
{ $match: { "beds": { $gte: 5 } } },
{ $group: { _id: "$beds", count: { $count: {} } } },
{ $sort: { count: -1 } },
{ $limit: 3 }
])
As you can see, the order of the stages matters. If we were to limit before we sorted, then we would get different results.
: 1
f. set the field to 0 to include, : 0
g. new value specified for new fields and existing fields can be given a new value, :
Notice the differences in the project stages below.
Aggregation #1
Aggregation #2
Aggregation #3
For example, if we wanted to count how many bed options above 5 there are, we could perform the following aggregation.
In this example, we rename beds to _id.
The same pipeline without the $set stage.
Store counts of airbnbs with greater than 5 beds.