Data Science with JavaScript


2023 Edit: Hal9 is still very much working at the intersection of data and web technologies; however, web technologies are now just one of the many technologies we use to build amazing data experiences on the web. We have also embraced Large Language Models and enterprise technologies like data lakes and Kubernetes. More recent updates are at hal9.com/news.

Hi there, since the beginning of this year we've been exploring how far we can take Data Science with JavaScript. As part of this journey, we started hal9.ai, an integrated environment to help us be more productive when analyzing data with JavaScript.

We want to ask for your feedback, but more importantly, we want to use this post to share what we've learned so far:

  1. Visualizations: JavaScript is great at interactive data visualization; this is probably obvious but worth mentioning nonetheless. Some highlights: D3.js is still a great visualization library; however, D3.js is really low level, kinda like TensorFlow rather than Keras. We actually set out to create our own charting library to combine the flexibility of D3 with the ease of use of libraries like Plotly, only to find out later that Plot.js had launched as an amazing library built on top of D3. So we ended up integrating Plot.js as our recommended charting library.
  2. Transformations: We found that JavaScript in combination with D3.js has a pretty decent set of data import and transformation functions; however, it comes nowhere near Pandas or dplyr. After shopping around, we found Tidy.js, loved it, and adopted it. The combination of Tidy.js, D3.js and Plot.js is absolutely amazing for visualizations and data wrangling with small datasets, say 10K-100K rows. We were very happy with this for a while; however, once we moved away from visualizations into data analysis, we found that 100K rows is quite restrictive, and that this stack also gets slow with 1K-10K columns. So we switched gears and started using Arquero.js, a columnar JS library that enabled us to process 1M+ rows in the browser, a decent size for real-world data analysis.
  3. Modeling: We are currently exploring this space so our findings are not final, but let us share what we've found so far. TensorFlow.js is absolutely amazing; it provides a native port of TensorFlow to JavaScript with support for CPU, WebGL, WebAssembly and Node.js backends. We were able to write an LSTM to do time series prediction, so far so good. However, we started hitting issues when we moved on to simpler models, like a linear regression. We tried Regression.js but found it falls short, so we wrote our own script to handle single-variable regressions using TF.js and gradient descent. It actually worked quite well, but it exposed a flaw in this approach: there is a lot of work to be done to bring many models to the web. We also found that Arquero.js does not play nicely with TF.js, since Arquero.js does not support tensors; so we went on to explore Danfo.js, which integrates great with TF.js, but we found it hard to scale transformations to 1M+ rows and hit other rough edges. Since then, we have started exploring Pyodide, and we are considering contributing to Danfo.js or involving more server-side compute with Node.js, but that will be a story for another time.
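
To make the single-variable regression idea from item 3 a bit more concrete, here is a minimal sketch of fitting a line with gradient descent. It is not the script mentioned above: it swaps TF.js tensors for plain JavaScript arrays so it runs without any dependencies, and the function name `fitLinearRegression`, the learning rate, the epoch count and the toy data are all assumptions made for illustration.

```javascript
// Fit y ≈ slope * x + intercept with batch gradient descent on mean squared error.
// Assumption: plain arrays stand in for TF.js tensors; this is an illustrative sketch only.
function fitLinearRegression(xs, ys, { learningRate = 0.05, epochs = 2000 } = {}) {
  if (xs.length !== ys.length || xs.length === 0) {
    throw new Error("xs and ys must be non-empty arrays of equal length");
  }
  const n = xs.length;
  let slope = 0;
  let intercept = 0;

  for (let epoch = 0; epoch < epochs; epoch++) {
    let gradSlope = 0;
    let gradIntercept = 0;
    for (let i = 0; i < n; i++) {
      // Prediction error for one data point under the current parameters
      const error = slope * xs[i] + intercept - ys[i];
      // Partial derivatives of the mean squared error
      gradSlope += (2 / n) * error * xs[i];
      gradIntercept += (2 / n) * error;
    }
    // Step against the gradient
    slope -= learningRate * gradSlope;
    intercept -= learningRate * gradIntercept;
  }
  return { slope, intercept };
}

// Hypothetical toy data generated from y = 2x + 1
const xs = [0, 1, 2, 3, 4];
const ys = [1, 3, 5, 7, 9];

const model = fitLinearRegression(xs, ys);
const predict = (x) => model.slope * x + model.intercept;

console.log(model, predict(10));
```

On this toy data the fitted slope and intercept should approach 2 and 1. The learning rate is tuned to these small inputs; unscaled real-world columns would likely need normalization or a smaller step, which is part of why hand-rolling every model gets tedious.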

So net-net, we are still super excited about exploring Data Science, Data Engineering, Visualization and Artificial Intelligence with JavaScript, but realistically, it is going to take a few years for this to mature.

In the meantime, we think Data Science with JavaScript shines with smaller datasets and interactive visualizations, and we believe Hal9 can help you be productive with both. That said, we do believe that motivated JavaScript users can unblock themselves by adding new functionality and contributing libraries back to NPM or components to our open source project. Please do reach out in Hal9's GitHub repo if you want to lend a hand!

Alright, so call to action? Please head to hal9.ai and give it a shot! We would love to hear where you think this could be useful, what features we are missing, and any feedback you may have.

To keep in touch, please subscribe to our weekly email at news.hal9.ai, contact us at info@hal9.ai, or follow us on Twitter as @hal9.ai.

Thanks for reading along!

