CodexBloom - Programming Q&A Platform

How can I merge multiple text files in a directory using bash while preserving unique lines?

👀 Views: 69 💬 Answers: 1 📅 Created: 2025-05-31
bash file-handling scripting

I'm relatively new to this, so bear with me. I've been struggling with this for a few days and could really use some help. I'm trying to merge multiple text files located in a specific directory into a single output file while keeping only unique lines. My current approach uses a simple `cat` command, but the result includes duplicate lines, which is not what I want. Here's what I've tried so far:

```bash
cat /path/to/directory/*.txt > merged.txt
```

This gives me a `merged.txt` file, but it contains a lot of repeated lines. I also tried piping through `sort -u`, thinking it might help, but I'm running into issues with large files and performance. Here's the command I ran:

```bash
cat /path/to/directory/*.txt | sort -u > merged.txt
```

However, this takes a long time, and it sometimes fails with the error `bash: argument list too long` when there are too many files in the directory. I've read that using `find` can avoid this problem, but I'm unsure how to implement it correctly (my best guess is at the end of this post).

Could anyone suggest an efficient way to merge these files while keeping only unique lines and avoiding the `argument list too long` error? Also, are there performance considerations I should be aware of, especially with large files? This is part of a larger CLI tool I'm building. Any help would be greatly appreciated. Thanks in advance!
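Edit: in case it helps, here's my rough guess at what a `find`-based version might look like, pieced together from what I've read. The path and flags are just placeholders for my setup, so I may well have something wrong:

```bash
#!/usr/bin/env bash
# Guess: have find hand the .txt files to cat in batches via `-exec ... +`,
# which I believe sidesteps "argument list too long", then dedupe with sort -u.
# merged.txt is written to the current directory, not the input directory,
# so it shouldn't get swept up in the file list.
find /path/to/directory -maxdepth 1 -type f -name '*.txt' -exec cat {} + \
  | sort -u > merged.txt

# I've also seen people suggest forcing the C locale to speed up sort on big
# inputs, but I'm not sure whether that has side effects for my data:
# find ... -exec cat {} + | LC_ALL=C sort -u > merged.txt
```

Does that look roughly right? I've also seen `awk '!seen[$0]++'` mentioned as a way to dedupe while keeping the original line order, but I don't know how it behaves memory-wise on very large files.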