It may be a little while before v2.4.0 releases, as I want to test the new JD changes. Before that test happens, I am working on a mini-project to increase the number of known systems from 4500 to over 60,000, which would cover everything within 750 light years of Sol. In the past, I manually entered all the systems and calculated their positions as needed, which took a LONG time. Now I am trying to convert part of the Hipparcos star catalogue into a format that Aurora can use, through a mainly automated process, although some manual work is still required.
That sounds awesome!
Yes, it is fascinating, but I am also learning just how inaccurate some data can be. I have a spreadsheet with 65,000 stars that I exported from Hipparcos, which is the basis of the future known star list. I have been trying to parse text fields into usable data. For example, the spectral type is reported in many different ways. The normal way is a single letter (there are about 10 options), followed by a digit from 0-9, followed by a Roman numeral from I to VII, with an optional lowercase a/b for some supergiant stars. For example, G2V or K0III. In the catalogue, however, the majority of the entries in that column seem to be just free text such as sdM4, dM5.5eJ, M1/M2V, A0m..., M3.5Vvar, etc., so I had to write some Excel functions to extract what I needed.
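To illustrate the kind of extraction involved, here is a minimal Python sketch rather than the Excel functions actually used. The letter set, the function name `parse_spectral_type`, and the rule that an `sd` or `d` prefix implies luminosity class VI or V when no Roman numeral is present are all assumptions made for the example.

```python
import re
from typing import NamedTuple, Optional


class SpectralType(NamedTuple):
    letter: str                  # main class letter, e.g. "G"
    subtype: Optional[float]     # 0-9, may be fractional such as 5.5
    luminosity: Optional[str]    # Roman numeral, e.g. "V" or "Ia"


# Assumption: a lowercase prefix implies a luminosity class when none is given.
_PREFIX_LUMINOSITY = {"sd": "VI", "d": "V"}

# Optional prefix, class letter (assumed set), optional numeric subtype.
_CLASS = re.compile(r"(sd|d)?([OBAFGKMLTY])(\d(?:\.\d)?)?")
# Longer numerals are listed first so "III" is not read as "I".
_LUM = re.compile(r"(VII|VI|IV|V|III|II|I)(ab|a|b)?")


def parse_spectral_type(text: str) -> Optional[SpectralType]:
    """Pull letter, subtype and luminosity class out of a free-text entry."""
    text = text.strip()
    head = _CLASS.match(text)
    if head is None:
        return None  # unrecognised format: leave for manual review
    prefix, letter, digits = head.groups()
    subtype = float(digits) if digits else None
    # Look for a luminosity class anywhere after the class letter and subtype.
    lum = _LUM.search(text, head.end())
    if lum:
        luminosity = lum.group(1) + (lum.group(2) or "")
    else:
        luminosity = _PREFIX_LUMINOSITY.get(prefix)
    return SpectralType(letter, subtype, luminosity)
```

Entries the pattern does not recognise return `None`, so they can be collected and checked by hand instead of being guessed at.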
There are multiple names for each star: the 'popular' name and then names from many different star catalogues, seven of which (Gliese, Gliese-Jahreiss, Woolley, HR, HD, HIP, Bayer-Flamsteed) are included in the Hipparcos data. Everything has a HIP entry and usually one or two of the others. For example, Proxima Centauri is also known as Alpha Centauri C, V645 Centauri, GJ 551, HIP 70890, CCDM J14396-6050C, LFT 1110, LHS 49, LPM 526, LTT 5721 and NLTT 37460.
Bayer-Flamsteed is an old system that uses a Greek letter plus constellation, such as Alpha Centauri or Delta Pavonis, and also a number plus constellation, such as 61 Cygni, for when they ran out of Greek letters. In the HIP data, this is given in formats such as 61SigDra, Del Pav, 24Eta Cas, 107 Psc, 37Xi Boo, 40Omi2Eri, etc., with random amounts of spaces. So you sometimes have up to three leading digits, but not always, then a Greek letter shortened to three letters (except for the four Greek letters that only have two characters), then sometimes a number in the middle, and then a three-letter constellation at the end. Another fun parsing task.
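The format described above could be handled with a single regular expression. This is only a sketch: the abbreviation table is assumed from the usual three-letter forms (the examples above confirm only Sig, Del, Eta, Xi and Omi), and `parse_bayer_flamsteed` is a name invented for the example.

```python
import re
from typing import NamedTuple, Optional

# Assumed abbreviations for the 24 Greek letters; Mu, Nu, Xi and Pi are the
# four that only have two characters.
GREEK = {
    "Alp": "Alpha", "Bet": "Beta", "Gam": "Gamma", "Del": "Delta",
    "Eps": "Epsilon", "Zet": "Zeta", "Eta": "Eta", "The": "Theta",
    "Iot": "Iota", "Kap": "Kappa", "Lam": "Lambda", "Mu": "Mu",
    "Nu": "Nu", "Xi": "Xi", "Omi": "Omicron", "Pi": "Pi",
    "Rho": "Rho", "Sig": "Sigma", "Tau": "Tau", "Ups": "Upsilon",
    "Phi": "Phi", "Chi": "Chi", "Psi": "Psi", "Ome": "Omega",
}

_greek_alt = "|".join(sorted(GREEK, key=len, reverse=True))
# Optional 1-3 digit number, optional Greek abbreviation with optional index,
# then a three-letter constellation, with any amount of spacing in between.
_BF = re.compile(
    rf"^\s*(\d{{1,3}})?\s*(?:({_greek_alt})\s*(\d)?)?\s*([A-Z][A-Za-z]{{2}})\s*$"
)


class BayerFlamsteed(NamedTuple):
    number: Optional[int]    # Flamsteed number, e.g. 61
    greek: Optional[str]     # full Greek letter name, e.g. "Sigma"
    index: Optional[int]     # the number in the middle, e.g. 2 in Omi2
    constellation: str       # three-letter abbreviation, e.g. "Dra"


def parse_bayer_flamsteed(text: str) -> Optional[BayerFlamsteed]:
    m = _BF.match(text)
    if m is None:
        return None
    number, greek, index, constellation = m.groups()
    if number is None and greek is None:
        return None  # a bare constellation is not a designation
    return BayerFlamsteed(
        int(number) if number else None,
        GREEK[greek] if greek else None,
        int(index) if index else None,
        constellation,
    )
```

Because the Greek group is optional, an entry like 107 Psc falls through to number plus constellation, and constellations that look like Greek abbreviations (Del, Tau) are still read correctly when they come last.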
For this new list I am choosing from the different catalogues in order, taking the first one that has text. I start with the popular name if given; if not, I use Bayer-Flamsteed, then Gliese, etc. Unfortunately, I didn't have that method when I did my original list, so in a lot of cases the new name doesn't match the current name. I have about 4500 stars in the Aurora database but only 3500 that match the new list. Some of that is because I added stars discovered after the catalogue was created and also a lot of brown dwarfs from the WISE catalogue, etc. So I would prefer to add to my current table, not replace it. That means I have to go through the non-matches and figure out why, which is what I am doing right now.
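The "first catalogue that has text" rule is simple to express in code. In this sketch the column names are hypothetical, and the order after Gliese is an assumption, since only the first three entries are stated above.

```python
from typing import Mapping, Optional, Tuple

# Popular name, then Bayer-Flamsteed, then Gliese, as described above.
# Everything after "gliese" is an assumed order with hypothetical column names.
NAME_PRIORITY = [
    "proper", "bayer_flamsteed", "gliese",
    "gliese_jahreiss", "woolley", "hr", "hd", "hip",
]


def choose_name(row: Mapping[str, object]) -> Optional[Tuple[str, str]]:
    """Return (column, value) for the first catalogue column that has text."""
    for column in NAME_PRIORITY:
        value = str(row.get(column) or "").strip()
        if value:
            return column, value
    return None


# Example: no popular or Bayer-Flamsteed name, so the Gliese entry wins.
row = {"proper": "", "bayer_flamsteed": "  ", "gliese": "551", "hip": "70890"}
print(choose_name(row))
```

Returning the column along with the value makes it possible to add the right catalogue prefix afterwards.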
Sometimes it is because I used a different name, or the star in the new list doesn't exist in Aurora, but I am also finding some weird situations, like a star that is in the HIP data but doesn't exist because it was just a light artifact, or stars that were listed as 15 LY but are actually 2300 LY, etc. It's a slow process. I might ultimately just replace the list and try to add the old data back in, but I would still need to check that I don't end up putting the same star in twice with different names.
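One possible way to catch the same star under two names is to compare every alias of a new star against the existing names. This is a hedged sketch only: it assumes the existing list is available as plain name strings, and `normalize` and `find_existing` are names made up for the example.

```python
import re
from typing import Iterable, List


def normalize(name: str) -> str:
    """Collapse runs of whitespace and ignore case so 'GJ  551' equals 'gj 551'."""
    return re.sub(r"\s+", " ", name).strip().casefold()


def find_existing(aliases: Iterable[str], existing_names: Iterable[str]) -> List[str]:
    """Return the existing names that match any alias of a new star."""
    wanted = {normalize(a) for a in aliases if a and a.strip()}
    return [n for n in existing_names if normalize(n) in wanted]


existing = ["Proxima Centauri", "Sol", "gj  551"]
aliases = ["GJ 551", "HIP 70890", "Proxima  Centauri"]
print(find_existing(aliases, existing))
```

This only catches exact matches after normalisation; a name that was typed differently, or a star that has no shared alias at all, would still need a manual check.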
Finally, many of the more distant HIP stars will be multi-star systems, but reported as single stars because they can't be separated at that distance. I will need to add some random binaries and trinaries to balance, or all the distant systems will be single-star.
Anyway - whatever I end up doing, it won't be quick.