Another requirement was the ability to jump to a specific page on demand, so cursor-based "previous/next" pagination was ruled out immediately; that excludes the mongoose-paginate-v2 and mongoose-aggregate-paginate-v2 libraries.
The oldest, and probably the simplest to use, is mongoose-paginate: it relies on plain search queries with limit, sort, and skip operations. It is a good fit for simple pagination: install the plugin, add a few lines of code to your endpoint, and the work is done. It can even use mongoose's "populate", which emulates joins from the SQL world. Technically, populate just issues additional queries to the database, which is probably not what you want. Worse, as soon as your query gets even a little more complicated, with any data transformation, the plugin becomes totally unusable.
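For reference, here is a minimal sketch of such an endpoint (assuming a hypothetical Driver model; the call follows the plugin's standard paginate(query, options) signature):

const mongoose = require('mongoose');
const mongoosePaginate = require('mongoose-paginate');

const driverSchema = new mongoose.Schema({ firstName: String, lastName: String, active: Boolean });
driverSchema.plugin(mongoosePaginate);
const Driver = mongoose.model('Driver', driverSchema);

// GET /drivers?page=2; resolves to { docs, total, limit, page, pages }
const getDrivers = (page = 1) =>
  Driver.paginate({ active: true }, { page, limit: 10, sort: { createdAt: -1 } });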
I know just one way to use it sensibly in such cases: first create a view (technically, a pre-saved query that MongoDB exposes as a read-only collection), and then run pagination with mongoose-paginate over that view. Not bad: the complicated query stays hidden behind the view.
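For completeness, here is roughly how such a view can be created (a sketch using the native driver exposed by mongoose; the view name and pipeline are illustrative):

// Creates a read-only view backed by the given pipeline (MongoDB 3.4+).
await mongoose.connection.db.createCollection('activeDriversView', {
  viewOn: 'drivers',
  pipeline: [{ $match: { active: true } }],
});

Still, we have a better idea of how to solve this problem. The MongoDB Aggregation Framework is here!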
It was a great day for the MongoDB community, I guess, when the Aggregation Framework was released: it can express most of the queries you can imagine. So we thought about taking mongoose-aggregate-paginate into use.
But the next two things disappointed us:
Why does this plugin exist at all? What task does it help to solve that cannot be solved without it, with the same effort? It looks like just one more dependency in your project: it brings no real benefit and does not even save you time.
The internal codebase, and the general approach to making queries. This library makes TWO calls to the database and waits for the responses via Promise.all: the first to fetch the query result, and the second to count the total number of records the query returns, without the $skip and $limit stages. It needs that count to calculate the total number of pages.
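In other words, what the plugin does internally boils down to something like the following (my own simplified reconstruction, not the library's actual code; Model stands for any mongoose model):

const paginateInTwoQueries = async (Model, pipeline, skip = 0, limit = 10) => {
  const [data, countResult] = await Promise.all([
    // Query #1: the page of documents.
    Model.aggregate([...pipeline, { $skip: skip }, { $limit: limit }]),
    // Query #2: the same pipeline again, just to count the total.
    Model.aggregate([...pipeline, { $count: 'total' }]),
  ]);
  const total = countResult.length ? countResult[0].total : 0;
  return { data, total, pages: Math.ceil(total / limit) };
};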
How can we avoid extra queries to the database? The worst part here is that the whole aggregation pipeline runs twice, which can be costly in terms of memory and CPU usage. Moreover, if the collection is huge and documents tend to be a few megabytes each, it can also hurt disk I/O, which is another big problem. The good news: the Aggregation Framework has a specific stage in its arsenal that solves this problem. It's $facet:
Processes multiple aggregation pipelines within a single stage on the same set of input documents. Each sub-pipeline has its own field in the output document where its results are stored as an array of documents. An aggregation pipeline for pagination will have the following shape:
{
$facet: {
<outputField1>: [ <stage1>, <stage2>, ... ],
<outputField2>: [ <stage1>, <stage2>, ... ],
...
}
}
Imagine that you have two collections, Statistic and Drivers. The Drivers collection is fairly static in terms of the types and number of fields across documents. But Statistic is polymorphic and changes over time as business requirements evolve.
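To make the example concrete, assume mongoose schemas roughly like these (the field names are my assumptions, chosen to match the pipeline below):

const mongoose = require('mongoose');

const driverSchema = new mongoose.Schema({
  firstName: String,
  lastName: String,
});

// Polymorphic: meta is a free-form object that evolves with the business.
const statisticSchema = new mongoose.Schema({
  driverId: { type: mongoose.Schema.Types.ObjectId, ref: 'Driver' },
  entityId: mongoose.Schema.Types.ObjectId,
  message: String,
  meta: mongoose.Schema.Types.Mixed,
  active: Boolean,
  createdAt: { type: Date, default: Date.now },
});

const Driver = mongoose.model('Driver', driverSchema);
const Statistic = mongoose.model('Statistic', statisticSchema);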
Also, some drivers can accumulate large statistic documents and a long history in general, so you cannot embed Statistic as a subdocument of Driver. The code and the MongoDB query will have the following shape:

const ridesInfoPaginationPipeline = (filter = {}, skip = 0, limit = 10, sort = {}) => [
  {
    $match: {
      ...filter,
      active: true,
    },
  },
  {
    $sort: {
      ...sort,
      createdAt: -1,
    },
  },
  {
    // The aggregation runs on Statistic, so we join the driver document
    // from the drivers collection by its _id.
    $lookup: {
      from: 'drivers',
      localField: 'driverId',
      foreignField: '_id',
      as: 'driver',
    },
  },
  {
    $unwind: {
      path: '$driver',
      preserveNullAndEmptyArrays: true,
    },
  },
  {
    $project: {
      // Records without a driver fall back to the 'Technical' label.
      driver: {
        $ifNull: [{
          $concat: ['$driver.firstName', ' ', '$driver.lastName'],
        }, 'Technical'],
      },
      entityId: 1,
      message: 1,
      meta: 1,
      createdAt: 1,
    },
  },
  {
    $facet: {
      // Sub-pipeline #1: the total count of matched documents
      // (stored under the 'createdAt' key, so it ends up as total.createdAt).
      total: [{
        $count: 'createdAt',
      }],
      // Sub-pipeline #2: the documents themselves; $addFields is a no-op here,
      // present only because a facet must contain at least one stage.
      data: [{
        $addFields: {
          _id: '$_id',
        },
      }],
    },
  },
  {
    // Turn the single-element total array into an object.
    // NB: if nothing matched, total is empty and $unwind drops the result.
    $unwind: '$total',
  },
  {
    $project: {
      // Emulate skip/limit in memory over the facet result;
      // a null limit means "take everything up to the total count".
      data: {
        $slice: ['$data', skip, {
          $ifNull: [limit, '$total.createdAt'],
        }],
      },
      meta: {
        total: '$total.createdAt',
        limit: {
          $literal: limit,
        },
        page: {
          $literal: ((skip / limit) + 1),
        },
        pages: {
          $ceil: {
            $divide: ['$total.createdAt', limit],
          },
        },
      },
    },
  },
];
const executePagination = async (filter, skip, limit, sort) => {
  return Statistic.aggregate(ridesInfoPaginationPipeline(filter, skip, limit, sort));
};
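Because of the final $unwind and $project stages, the aggregation resolves to an array with a single document, roughly like this (illustrative call and output; the numbers are made up):

const [page] = await executePagination({}, 10, 10, {});
// page = {
//   data: [ /* 10 statistic documents, each with the resolved driver name */ ],
//   meta: { total: 137, limit: 10, page: 2, pages: 14 },
// }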
As you can see, using the Aggregation Framework and the $facet stage, we can fetch a page of documents and the total count in a single query, avoid running the whole pipeline twice, and return the data together with the pagination metadata in one response.
Regarding the main drawbacks of such an approach, I guess just one is major: the higher complexity of development and debugging, along with a higher entry threshold. That includes performance troubleshooting, knowledge of a variety of stages, and data modeling approaches.