GCP Firestore Performance Issues with Large Dataset Queries in Node.js
I'm experiencing significant performance issues when querying a large dataset in Google Cloud Firestore using the Node.js Admin SDK. My collection has over 500,000 documents, and queries with filters often take more than 10 seconds to return results. Here's a simplified version of the query I'm using:

```javascript
const admin = require('firebase-admin');

admin.initializeApp(); // credentials come from the environment (ADC)
const db = admin.firestore();

// Fetch every active document, ordered by creation time.
async function getFilteredData() {
  const snapshot = await db.collection('myCollection')
    .where('status', '==', 'active')
    .orderBy('createdAt')
    .get();
  return snapshot.docs.map(doc => doc.data());
}
```

I've made sure the indexing is set up properly, in particular a composite index covering the `status` and `createdAt` fields. In the Firestore console's Indexes tab, I confirmed that both fields are indexed correctly (the exact index definition I deployed is at the end of this question). Even with this setup, the query still performs poorly. Monitoring usage in the GCP console, I also noticed that Firestore read operations spike significantly while these queries run.

I've tried using pagination to limit the number of documents retrieved at once, but it hasn't significantly improved performance. Here's how I'm implementing it:

```javascript
let lastVisible = null; // snapshot of the last document on the previous page
const pageSize = 100;

async function getPaginatedData() {
  let query = db.collection('myCollection')
    .where('status', '==', 'active')
    .orderBy('createdAt')
    .limit(pageSize);

  // Resume after the last document seen on the previous page.
  if (lastVisible) {
    query = query.startAfter(lastVisible);
  }

  const snapshot = await query.get();
  // Only advance the cursor when the page is non-empty; otherwise an
  // empty page would reset lastVisible and restart from the first page.
  if (!snapshot.empty) {
    lastVisible = snapshot.docs[snapshot.docs.length - 1];
  }
  return snapshot.docs.map(doc => doc.data());
}
```

I've also looked into Firestore's `select()` method to retrieve only the fields I need, but it hasn't made a noticeable difference (my usage is shown below, in case I'm applying it incorrectly).

Are there any best practices or optimizations I'm missing? Is there a recommended approach for handling large datasets in Firestore effectively? Any insights would be greatly appreciated.
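For reference, here's how I'm applying `select()` — a simplified sketch of my actual code, keeping only the two fields the query already uses (my real schema has more fields, which I've omitted here):

```javascript
// Projection query: ask the server to return only the named fields
// instead of full documents.
async function getProjectedData() {
  const snapshot = await db.collection('myCollection')
    .where('status', '==', 'active')
    .orderBy('createdAt')
    .select('status', 'createdAt') // Admin SDK field projection
    .get();
  return snapshot.docs.map(doc => doc.data());
}
```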
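And this is the composite index backing the query, as it appears in my `firestore.indexes.json` (deployed with `firebase deploy --only firestore:indexes`; collection and field names match the code above):

```json
{
  "indexes": [
    {
      "collectionGroup": "myCollection",
      "queryScope": "COLLECTION",
      "fields": [
        { "fieldPath": "status", "order": "ASCENDING" },
        { "fieldPath": "createdAt", "order": "ASCENDING" }
      ]
    }
  ]
}
```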