If you are doing mostly linear work on big data, I don’t see why Julia would be faster than R or Python. For that sort of thing, Python and R use heavily optimized packages implemented in C, and have a great deal more support than Julia for larger-than-memory operations, etc. Julia’s DataFrames are perfectly fine for what they do, but querying them is frequently much slower than in R. None of these are fundamental issues with Julia; it is just that the investment in Julia’s DataFrames is tiny compared to the investment in R or Pandas. Furthermore, since they all end up using compiled C code in the background for that sort of thing, you shouldn’t expect much performance difference for those tasks.
Not to mention that Pandas and dplyr are likely to remain better for a data manipulation pipeline in the near future. And while https://github.com/jmboehm/RegressionTables.jl is great, the specialized microeconometric packages in R (or Stata) give you the exact sorts of tables and output that journals for heavily empirical microeconometric research would expect.
Of course, that all depends on whether all you are doing is a bunch of (linear) microeconometrics, or whether that is a small part of a bigger project where Julia’s benefits would start to shine (e.g. structural estimation, anything nonlinear, differential equations, dynamic programming, anything where you write an algorithm, etc.).
Sure. But (linear) microeconometrics isn’t really about learning a language; it is about learning a bunch of packages and enough glue to hold them together. If you are running a bunch of regressions from packages people have already written, and manipulating/cleaning data, there is no reason you need a “serious” programming language.
I think the main issue with the two-language problem is about writing algorithms where things are just too slow in higher-level languages, so you end up writing lower-level kernels in C/Fortran. If you are doing linear microeconometrics, then people have already done that work for you and you are basically just stitching together packages.
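To make the "writing your own algorithm" case concrete, here is a minimal sketch of the kind of hand-written loop I have in mind: value function iteration for a toy growth model, in plain Python with no packages. The model, the parameter values, and the function name are all my own assumptions for illustration, not anything from a particular paper or library.

```python
import math


def value_function_iteration(grid, alpha=0.3, beta=0.9, tol=1e-6, max_iter=1000):
    """Iterate V(k) = max over k' of log(k**alpha - k') + beta * V(k') on a grid.

    Hypothetical toy model: `grid` must be sorted in increasing order and
    serves as both the state space and the set of choices for k'.
    """
    n = len(grid)
    v = [0.0] * n  # initial guess for the value function
    for iteration in range(1, max_iter + 1):
        v_new = [0.0] * n
        for i, k in enumerate(grid):
            output = k ** alpha
            best = -math.inf
            # Inner loop over every candidate k': this nested loop is the
            # "kernel" that an interpreted language executes slowly.
            for j, k_next in enumerate(grid):
                consumption = output - k_next
                if consumption <= 0.0:
                    break  # grid is increasing, so larger k' is infeasible too
                candidate = math.log(consumption) + beta * v[j]
                if candidate > best:
                    best = candidate
            v_new[i] = best
        gap = max(abs(a - b) for a, b in zip(v_new, v))
        v = v_new
        if gap < tol:
            return v, iteration
    return v, max_iter


# Assumed grid: 40 capital levels from 0.05 to 0.44.
grid = [0.05 + 0.01 * i for i in range(40)]
values, iterations = value_function_iteration(grid)
```

With 40 grid points this is instant, but the work grows with the square of the grid size per iteration, and no existing package does this exact loop for you. That is the situation where you either drop down to C/Fortran or use something like Julia; running regressions is not.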
Hopefully in the long run, and I sympathize with all of your reasons to want to double down on one language… I would love to use fewer languages. But once you know two, it is pretty easy to pick up a third (especially if you don’t intend to write algorithms in it!).
Julia is great at many things, but for now I feel there are plenty of places where you are better off using the more specialized tools that have the best packages (R if you are doing heavy econometrics/statistics, Python if you are doing web scraping/neural networks/glue code, Julia for anything nonlinear or where you actually write your own algorithms, Stan for Bayesian stuff that fits into its framework, Dynare for DSGE stuff, etc.).
As I said, just take that as one opinion. The reason I feel the need to state it is that there is a limited number of people, and a limited amount of researcher time, that can be put towards investing in Julia packages. Any time spent reproducing things that R or Python already do very well is maintenance and development resources that may be taken away from things where Julia is the state of the art.