CodexBloom - Programming Q&A Platform

How to efficiently handle large dataset pagination in T-SQL with OFFSET-FETCH?

👀 Views: 99 💬 Answers: 1 📅 Created: 2025-06-13
sql t-sql pagination performance

I've searched everywhere and can't find a clear answer. I'm working on a personal project and I'm sure I'm missing something obvious here. I have a T-SQL query that needs to paginate through a large dataset of user records. My current approach uses the `OFFSET-FETCH` clause, but I'm running into performance problems when the offset becomes large, leading to slow response times. Here's the query I have been using:

```sql
DECLARE @PageNumber INT = 10;
DECLARE @RowsPerPage INT = 50;

SELECT *
FROM Users
ORDER BY UserID
OFFSET (@PageNumber - 1) * @RowsPerPage ROWS
FETCH NEXT @RowsPerPage ROWS ONLY;
```

While this works for the first pages, I noticed that as `@PageNumber` increases, the query execution time increases significantly as well. I also tried adding an index on `UserID`, but the performance still doesn't meet our needs.

I read about using `ROW_NUMBER()` in combination with a Common Table Expression (CTE) to handle pagination, so I attempted that as follows:

```sql
-- Same paging variables as in the first query
DECLARE @PageNumber INT = 10;
DECLARE @RowsPerPage INT = 50;

WITH UserCTE AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY UserID) AS RowNum
    FROM Users
)
SELECT *
FROM UserCTE
WHERE RowNum BETWEEN ((@PageNumber - 1) * @RowsPerPage + 1)
                 AND (@PageNumber * @RowsPerPage);
```

While this works adequately for smaller datasets, the performance degrades as the dataset grows. I also ran into an issue with `ROW_NUMBER()` where it appears to return duplicate row numbers if rows have the same `UserID`. I received the error message: "The result set of 'UserCTE' must contain a unique column."

Each query runs fine, but the performance hit is evident when there are millions of records. Are there any best practices or alternative strategies for handling pagination efficiently in T-SQL, especially for large datasets? I'm open to suggestions for both improving the current approach and any other design patterns that may be more suitable for this scenario. My development environment is Windows.
I'd really appreciate any guidance on this. For context: I'm using T-SQL on Ubuntu. Has anyone else encountered this? What am I doing wrong?
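
For reference, one alternative that is often suggested for deep pages is keyset (seek) pagination: instead of skipping `(@PageNumber - 1) * @RowsPerPage` rows, the query filters on the last key returned by the previous page. The following is only a minimal sketch, assuming `UserID` is a unique, indexed, positive integer key on `Users`. The `@LastSeenUserID` variable is a hypothetical name introduced for illustration and is not part of the queries above.

```sql
-- Keyset (seek) pagination sketch.
-- Assumption: UserID is unique and indexed (e.g. the clustered primary key).
DECLARE @RowsPerPage INT = 50;

-- Hypothetical parameter: the largest UserID returned by the previous page.
-- Use 0 for the first page (assumes all UserID values are positive).
DECLARE @LastSeenUserID INT = 0;

SELECT TOP (@RowsPerPage) *
FROM Users
WHERE UserID > @LastSeenUserID   -- seek past the previous page instead of counting rows
ORDER BY UserID;                 -- unique sort key keeps page boundaries deterministic
```

The caller would remember the last `UserID` of each page and pass it back in for the next one. With `OFFSET`, the skipped rows still have to be read and discarded, so the cost grows with the page number; with a seek predicate on an indexed key, it should stay roughly constant. The trade-off is that this pattern supports "next page" navigation but not jumping directly to an arbitrary page number.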