Data Science with JavaScript


2023 Edit: Hal9 is still very much working at the intersection of data and web technologies; however, web technologies are now just one of the many technologies we use to build amazing data experiences on the web. We have also embraced Large Language Models and enterprise technologies like data lakes and Kubernetes. More recent updates are at hal9.com/news.

Hi there, since the beginning of this year we've been exploring how far we can take Data Science with JavaScript. As part of this journey, we started hal9.ai, an integrated environment to help us be more productive when analyzing data with JavaScript.

We want to ask for your feedback, but more importantly, we want to use this post to share what we've learned so far:

  1. Visualizations: JavaScript is great at interactive data visualization; this is probably obvious but worth mentioning nonetheless. Some highlights: D3.js is still a great library for building visualizations; however, D3.js is really low level, kind of like TensorFlow as opposed to Keras. We actually set out to create our own charting library to combine the flexibility of D3 with the ease of use of libraries like Plotly, only to find out later that Plot.js had launched as an amazing library that builds on top of D3. So we ended up integrating Plot.js as our recommended charting library.
  2. Transformations: We found that JavaScript in combination with D3.js has a pretty decent set of data import and transformation functions; however, it comes nowhere near Pandas or dplyr. After shopping around, we found Tidy.js, loved it, and adopted it. The combination of Tidy.js, D3.js and Plot.js is absolutely amazing for visualizations and data wrangling with small datasets, say 10-100K rows. We were very happy with this for a while; however, once we moved from visualizations into data analysis, we found that a 100K-row ceiling is quite restrictive, and that this stack also gets slow with 1K-10K columns. So we switched gears and started using Arquero.js, a columnar JS library that enabled us to process 1M+ rows in the browser, a decent size for real-world data analysis.
  3. Modeling: We are currently exploring this space, so our findings are not final, but let us share what we've found so far. TensorFlow.js is absolutely amazing; it provides a native port of TensorFlow to JavaScript with support for CPU, WebGL, WebAssembly and Node.js backends. We were able to write an LSTM to do time series prediction; so far so good. However, we started hitting issues when we moved to simpler models, like linear regression. We tried Regression.js but found it falls short, so we wrote our own script to handle single-variable regressions using TF.js and gradient descent. It actually worked quite well, but it exposed a flaw in this approach: there is a lot of work to be done to bring many models to the web. We also found that Arquero.js does not play nicely with TF.js since, well, Arquero.js does not support tensors; so we went on to explore Danfo.js, which integrates great with TF.js, but we found it hard to scale transformations to 1M+ rows and hit other rough edges. Since then we have started exploring Pyodide, and perhaps contributing to Danfo.js, or perhaps involving more server-side compute with Node.js, but that will be a story for another time.
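
To make the single-variable regression idea from the modeling item concrete, here is a minimal sketch of fitting a line by gradient descent. It is written in plain JavaScript rather than TF.js so that it runs without any dependencies, and it is not the script mentioned above. The function name `fitLine` and the `learningRate` and `epochs` options are hypothetical, chosen only for illustration.

```javascript
// Fit y ≈ slope * x + intercept by batch gradient descent on mean squared error.
// Assumption: fitLine, learningRate and epochs are illustrative names for this
// sketch, not part of TF.js, Regression.js, or Hal9.
function fitLine(xs, ys, { learningRate = 0.05, epochs = 5000 } = {}) {
  if (xs.length !== ys.length || xs.length === 0) {
    throw new Error("xs and ys must be non-empty arrays of equal length");
  }
  const n = xs.length;
  let slope = 0;
  let intercept = 0;

  for (let epoch = 0; epoch < epochs; epoch++) {
    // Accumulate the gradient of the mean squared error over all points.
    let gradSlope = 0;
    let gradIntercept = 0;
    for (let i = 0; i < n; i++) {
      const error = slope * xs[i] + intercept - ys[i];
      gradSlope += (2 / n) * error * xs[i];
      gradIntercept += (2 / n) * error;
    }
    // Step against the gradient.
    slope -= learningRate * gradSlope;
    intercept -= learningRate * gradIntercept;
  }
  return { slope, intercept };
}

// Synthetic, noise-free data generated from y = 2x + 1.
const xs = [0, 1, 2, 3, 4];
const ys = xs.map((x) => 2 * x + 1);
const model = fitLine(xs, ys);
console.log(model);
```

On this synthetic data the fitted slope and intercept should land very close to 2 and 1. Note that the fixed learning rate only works because the inputs are small; with larger or unscaled inputs you would likely need to normalize the data or lower the rate, which hints at why hand-rolling every model for the web adds up to a lot of work.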

So net-net, we are still super excited about exploring Data Science, Data Engineering, Visualization and Artificial Intelligence with JavaScript; but realistically, it is going to take a few years for this to mature.

In the meantime, we think Data Science with JavaScript shines with smaller datasets and interactive visualizations, which we believe Hal9 can help you be productive with. That said, we do believe that motivated JavaScript users can unblock themselves by adding new functionality and contributing libraries back to npm or components to our open source project. Please do reach out in Hal9's GitHub repo if you want to lend a hand!

Alright, so call to action? Please head to hal9.ai and give it a shot! We would love to hear where you think this could be useful, what features we are missing, and any feedback you may have.

To keep in touch, please subscribe to our weekly email at news.hal9.ai, contact us at info@hal9.ai, or follow us on Twitter as @hal9.ai

Thanks for reading along!

