Hello,
Looks like you've got an interesting task here. As I understand it, you need a Node.js-based script that can process CSV files automatically, without manual intervention. I have 3+ years of experience with Node.js and MongoDB and can help you with this.
Here's how I'm going to build this project:
Since new CSV files flow in constantly, I'll use a watcher plus an internal queue (or make the queue Redis-based if the volume of files is too large and you need several instances working on it).
The watcher will check for new files every X seconds, add them to the queue for processing, and delete them so they aren't re-added to the queue on the next iteration.
Now, the queue will be based either on Redis or on the async library, depending on the scale, as I mentioned. An optimal number of workers will be assigned to the queue to process the files' contents asynchronously.
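The worker side would look roughly like this. I've sketched a tiny dependency-free queue with a concurrency limit just to illustrate the shape; in the actual project this would be async.queue or a Redis-backed queue, and WORKER_COUNT is an illustrative value, not a tuned number:

```javascript
const WORKER_COUNT = 4; // placeholder; we'd tune this to your hardware/load

// A minimal concurrency-limited queue: at most `concurrency` tasks are
// processed at once, the rest wait their turn.
function createQueue(worker, concurrency = WORKER_COUNT) {
  const tasks = [];
  let active = 0;
  let onDrain = null;

  async function runNext() {
    if (tasks.length === 0) {
      if (active === 0 && onDrain) onDrain(); // everything finished
      return;
    }
    active++;
    const task = tasks.shift();
    try {
      await worker(task);
    } finally {
      active--;
      runNext(); // pick up the next waiting task
    }
  }

  return {
    push(task) {
      tasks.push(task);
      if (active < concurrency) runNext();
    },
    // resolves once the queue is empty and all workers are idle
    drained: new Promise((resolve) => { onDrain = resolve; }),
  };
}
```

A worker here would parse one file's content and write the records to MongoDB.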
Parsing the data will be a bit more challenging, but if you provide some samples, I should be able to come up with universal regex patterns that can parse most of the content you need organized into the database.
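Purely for illustration (the real patterns depend entirely on your sample data), here's the kind of named-group regex I have in mind, assuming a made-up "id, full name, email" line format that tolerates some whitespace noise:

```javascript
// Hypothetical format: "id , full name , email" — NOT your actual data.
const LINE_PATTERN =
  /^(?<id>\d+)\s*,\s*(?<name>[^,]+?)\s*,\s*(?<email>[^@\s,]+@[^\s,]+)\s*$/;

// Returns a clean record object, or null for lines that don't match
// (in practice those would be logged for review rather than dropped silently).
function parseLine(line) {
  const match = LINE_PATTERN.exec(line);
  if (!match) return null;
  return { ...match.groups, id: Number(match.groups.id) };
}
```

Once I see your real files, I'd build a pattern (or a small set of patterns) like this per format and validate them against the full sample set.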
So yeah, I don't see any blockers here apart from the parsing, for which I'll just need plenty of sample data to build accurate regex patterns.
I have a couple of questions regarding the scale of the project and the lookup data. Let's have a chat and discuss this.
Looking forward to working with you!
Best,
Nick.