> This find | xargs mawk | mawk pipeline gets us down to a runtime of about 12 s... | Hacker News


snaky on March 12, 2019 | on: Nginx to Be Acquired by F5 Networks

> This find | xargs mawk | mawk pipeline gets us down to a runtime of about 12 seconds, or about 270MB/sec, which is around 235 times faster than the Hadoop implementation.

https://adamdrake.com/command-line-tools-can-be-235x-faster-...
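As a rough sanity check on the quoted numbers, the sketch below works out what they imply when taken together. It uses only the three figures in the quote. The function names are made up for illustration, and it assumes the "235 times faster" figure compares throughput, which the quote does not spell out.

```python
# Figures taken from the quoted sentence only; everything derived below is
# an implication of those figures, not a number reported by the article.
RUNTIME_S = 12          # "about 12 seconds"
THROUGHPUT_MB_S = 270   # "about 270MB/sec"
SPEEDUP = 235           # "around 235 times faster"


def implied_data_mb(runtime_s: float, throughput_mb_s: float) -> float:
    """Data volume implied by a runtime and an average throughput."""
    return runtime_s * throughput_mb_s


def implied_baseline_mb_s(throughput_mb_s: float, speedup: float) -> float:
    """Baseline throughput, assuming the speedup is a throughput ratio."""
    return throughput_mb_s / speedup


if __name__ == "__main__":
    data_mb = implied_data_mb(RUNTIME_S, THROUGHPUT_MB_S)
    baseline = implied_baseline_mb_s(THROUGHPUT_MB_S, SPEEDUP)
    print(f"implied data processed: ~{data_mb:.0f} MB")
    print(f"implied Hadoop throughput: ~{baseline:.2f} MB/s")
```

Under those assumptions, the pipeline handled a few gigabytes and the Hadoop baseline ran at roughly 1 MB/sec.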

Zariel on March 12, 2019

Using Hadoop/Spark for <2 GB of data seems like a terrible idea.

When all you have is a hammer, everything starts to look like a nail.

StreamBright on March 12, 2019

Well, this is great until you need more nodes. :) I am talking about the same scalability while maintaining a much lower ecological and financial footprint.
