This presentation was recorded at GOTO Amsterdam 2024. #GOTOcon #GOTOams
https://gotoams.nl
Roy van Rijn – Experienced Developer & Architect, Robotics Enthusiast & Hobby Mathematician @royvanrijn
ORIGINAL TALK TITLE
How Fast Can You Parse a File with 1 Billion Rows of Weather Data Using Java?
RESOURCES
https://x.com/royvanrijn
https://www.linkedin.com/in/royvanrijn
https://github.com/royvanrijn
https://royvanrijn.com
Links
https://adventofcode.com
https://x.com/gunnarmorling
https://www.morling.dev
ABSTRACT
Last January a challenge was posted online by Gunnar Morling: How fast can you parse a file with 1 billion rows of weather data using Java?
Little did I know this deceptively simple question would lead me down a path that taught me all about parallelism, memory mapped files, SWAR techniques (SIMD within a register), bit twiddling, branchless code, mechanical sympathy, Graal native compilation and finally… I even turned to the dark side: using sun.misc.Unsafe.
Join me in this deep dive where I’ll explain all the code changes and tricks that took me from the reference implementation which processes the billion records in 4+ minutes, to processing everything in under 2 seconds.
Who knew Java could be this fast? […]
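To give a flavour of the SWAR technique named in the abstract, here is a minimal sketch of finding the first ';' separator in eight bytes at once using only plain long arithmetic. It is an illustration, not the code from the talk: the class name SwarSketch and the helpers firstSemicolon and toWord are hypothetical, and it assumes ASCII input read in little-endian byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not taken from the talk or any challenge entry.
final class SwarSketch {
    // The byte ';' (0x3B) repeated in all eight lanes of a long.
    private static final long SEMICOLONS = 0x3B3B3B3B3B3B3B3BL;
    private static final long LOW_BITS = 0x0101010101010101L;
    private static final long HIGH_BITS = 0x8080808080808080L;

    /**
     * Returns the index (0-7) of the first ';' among the eight bytes packed
     * into word (byte 0 is the least significant byte), or 8 if there is none.
     */
    static int firstSemicolon(long word) {
        // XOR turns every byte equal to ';' into 0x00.
        long x = word ^ SEMICOLONS;
        // Classic "has zero byte" trick: sets the high bit of each zero byte.
        // Lanes above the first match may be flagged spuriously by the borrow,
        // but the lowest flagged lane is always a real match.
        long mask = (x - LOW_BITS) & ~x & HIGH_BITS;
        // Trailing zero count / 8 = byte index; an empty mask gives 64 / 8 = 8.
        return Long.numberOfTrailingZeros(mask) >>> 3;
    }

    /** Packs the first eight ASCII characters of a string into a little-endian long. */
    static long toWord(String atLeastEightChars) {
        byte[] bytes = atLeastEightChars.getBytes(StandardCharsets.US_ASCII);
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        System.out.println(firstSemicolon(toWord("Delft;12"))); // prints 5
        System.out.println(firstSemicolon(toWord("Amsterda"))); // prints 8 (no ';')
    }
}
```

The point of the trick is that one subtraction, one AND-NOT and one bit count replace a loop of eight byte comparisons, with no branch per byte.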
TIMECODES
00:00 Intro
01:49 The challenge
06:07 Watch, learn, adopt, experiment
08:00 Mechanical sympathy
09:32 Temperature as integer
10:37 Memory mapped files
11:54 Getting unsafe
13:31 SWAR
17:22 Stringless
18:18 Branchless programming
20:35 Parse the temperature
30:14 Keeping track
36:22 Which JVM?
37:21 Graal (native-image)
39:38 Summary
40:50 Results
42:00 Outro
Download slides and read the full abstract here:
https://gotoams.nl/2024/sessions/3164
RECOMMENDED BOOKS
Monica Beckwith • JVM Performance Engineering • https://amzn.to/3zuJ7Ig
Scott Oaks • Java Performance • https://amzn.to/4eNhlH4
Trisha Gee, Kathy Sierra & Bert Bates • Head First Java • https://amzn.to/3k59BJ6
Trisha Gee & Kevlin Henney • 97 Things Every Java Programmer Should Know • https://amzn.to/3kiTwJJ
https://www.linkedin.com/company/goto-
https://www.instagram.com/goto_con
https://www.facebook.com/GOTOConferences
#Java #JVM #GraalVM #Parsing #Parallelism #MemoryMappedFiles #SWAR #BitTwiddling #BranchlessCode #MechanicalSympathy #GraalNative #JavaProgramming #AdventOfCode #1BillionRowChallenge #GunnarMorling #RoyvanRijn
Comments
The view count gives testimony to what a fun challenge that was.
How can I learn Java at this advanced level? Every course just teaches object-oriented programming.
When a software engineer stumbles upon the dark arts of real computer science…
The only thing that comes to mind not talked about was AVX2 or SSE4.x (I don't know if they're supported natively in Java).
These tricks are great for graph analytics too!
ok, how about Golang?
This is a brilliant talk and it's been featured in the last issue of Tech Talks Weekly newsletter 🎉
Congrats Roy!
Really Great talk, thanks for sharing this knowledge
If you optimize for specific arches, you can do the SIMD lookup with fewer instructions and much wider. For example, I’m using the memchr Rust crate by the genius BurntSushi, and specifically the AVX2 implementation. It does loops of 4 sequential comparisons with 256-bit registers. The SIMD part is just 2 instructions.
Kudos for mentioning Advent of Code! And yeah, most people can parse 1 Billion rows of weather data between every blink (and more if using strong Java)
Imagine now if we didn't have to deal with the absolute idiot who created that human readable string data format… Completely unrealistic problem, because the 1st billable hour of work would go to make the data persistence computer friendly, not trying to parse strings fast. (ex: [2 bytes cityid] [2 bytes temperature], or 4+4 if > 64k cities). Besides, no worthy sw engineer would ever create the problem of mixing data that is naturally partitioned (sensors or cities). It was an embarrassingly parallel problem made worse by tossing everything in a single file.
Any language that cannot do this entirely I/O limited, by reading the billion rows from disk in parallel, should be shamed. This would run at I/O speed in JavaScript, Java, C#, Python, Turbo Pascal, even Lua could do this. 🙂
I've followed the competition on Twitter and GitHub but this talk is just a Gem in how it is being told. Big up for the presentation/slides skills!
native compilation… is it really java anymore?
In 1975, at the University of Delft, my professor and I collaboratively developed an assembler and interpreter for the computer practicum. We had to run this on an IBM mainframe, using a higher-level language. To make it functional, we had to employ extensive masking and shifting operations. I vividly remember the complex logical intricacies we had to navigate to get it all working correctly. From this hands-on experience, I truly admire your work and the effort it takes to even get it working.
This was so much fun watching
Amazing presentation, thanks a lot for sharing ❤
From 4 minutes 49 seconds 679 milliseconds >>> to >>> 1 second 535 milliseconds?
Wow!
Great talk, thanks!
Wizard level optimizations
Even my low brain can understand what you said. Great talk, thank you Roy van Rijn
Using memory maps is cheating!
This is gold
This was an amazing talk
What a journey. Lovely talk 🙂