Using Node.js to process data

Node.js is used in this project for performance optimization, data transformation, static type checking, and debugging.

Here is the link to the 3D tool: https://sternenhimmel-der-menschheit.de/explore

File size optimization

Data for the stars

The stars shown in the 3D tool must match their real-world positions as seen from Earth. To achieve this, the position and size of the stars are taken from a CSV data file. However, since this dataset was not made specifically for this project, it contains a lot of information the website does not need. Node.js removes this extra information and transforms the rest into what the website needs.

The new file contains the star radius, which is based on the magnitude of the star from the original dataset and scaled to the minimum and maximum size the stars should have in this project. Based on this new radius, the stars that are too small to be visible are filtered out. Then the star coordinates are rounded to the lowest precision that shows no visual difference. I also renamed the "xno" property to "hr", since "hr" is more commonly used, which makes it more helpful as a keyword when searching for specific stars on the internet. I made this change after spending a significant amount of time looking for a star by its "xno" number, having forgotten that I could also use "hr" as a keyword. The xrpm and xdpm properties are removed, and a name property (taken from another dataset) is added to the same file.

The original files weigh 849 KB plus 190 KB (for the file storing the star names), 1,039 KB in total, while the new merged file (with ready-to-use data, which avoids some calculations on the front end) is 250 KB, less than a quarter of the original size. Since the originals are still present in the node folder, I can tweak the properties again, or put more data back in, if needed.

Before

xno,sra0,sdec0,is,mag,xrpm,xdpm
1,0.022536563966376783,0.7893978762666023,A1,6.7,-5.817764048288154e-8,-8.726646427703599e-8
2,0.02209295944816156,-0.0087799757648937,G9,6.29,2.1816615003444895e-7,-2.9088820951983507e-7
3,0.023278328898474376,-0.09961466705757638,K0,4.61,-4.3633232138517997e-8,4.3148418171767844e-7

+

hr,name,constellation,bv,ub
1,BD+,BD+,NaN,8
2,,,1.1,1.02
3,33    Psc,Psc,1.04,0.89

After

name,is,r,sra0,sdec0,hr
BD+,A1,0.2,0.023,0.789,1
,G9,0.3,0.022,-0.009,2
33    Psc,K0,0.7,0.023,-0.1,3
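
The transformation from the two source files to the merged one can be sketched roughly as follows. This is an illustration, not the project's actual script: the type names, the `roundTo` helper, `processStars`, and the visibility threshold `MIN_VISIBLE_RADIUS` are assumptions, and CSV parsing is left out (the rows are assumed to be already parsed into objects of strings). The radius formula is the linear mapping from magnitude [6, -1] to radius [0.35, 2.25] that appears later in this article, and the number of decimals is inferred from the sample output above.

```typescript
// Hypothetical row types; the field names follow the CSV headers shown above.
interface RawStar { xno: string; sra0: string; sdec0: string; is: string; mag: string }
interface RawName { hr: string; name: string }
interface Star { name: string; is: string; r: number; sra0: number; sdec0: number; hr: number }

// Round to the nearest value with the given number of decimals.
const roundTo = (value: number, decimals: number): number => {
  const factor = 10 ** decimals
  return Math.round(value * factor) / factor
}

// Linear mapping: magnitude 6 -> radius 0.35, magnitude -1 -> radius 2.25.
const magnitudeToRadius = (mag: number): number =>
  0.35 + ((mag - 6) / (-1 - 6)) * (2.25 - 0.35)

// Assumed threshold; the real cut-off is not given in the article.
const MIN_VISIBLE_RADIUS = 0.2

const processStars = (stars: RawStar[], names: RawName[]): Star[] => {
  // Look up names by HR number instead of scanning the name list per star.
  const nameByHr = new Map<number, string>(
    names.map((n): [number, string] => [+n.hr, n.name])
  )
  return stars
    .map((s) => ({
      name: nameByHr.get(+s.xno) ?? "",
      is: s.is,
      r: roundTo(magnitudeToRadius(+s.mag), 1),
      sra0: roundTo(+s.sra0, 3),
      sdec0: roundTo(+s.sdec0, 3),
      hr: +s.xno, // "xno" renamed to "hr"; xrpm and xdpm are simply not copied
    }))
    .filter((s) => s.r >= MIN_VISIBLE_RADIUS)
}
```

With the three sample rows above as input, this sketch should produce the values shown in the "After" sample.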

Camera targets

Originally, the camera targets (which specify the coordinates the camera needs to see a specific constellation) were stored in a single file. Since the amount of data has grown significantly since the beginning of the project, they are now split into multiple files for editing, two per culture: one for the constellations themselves and another for individual camera targets. Everything is merged after processing.

Another advantage of processing files with Node.js is that it gives more flexibility in how files are edited. Previously, I wrote the files directly in JSON. Now I can use TypeScript, which gives me static type checking (to verify that all the required properties are provided) and lets me reuse variables from the front end (constellation IDs, colors...). Previously, the color was specified individually everywhere, so it would have needed to be updated in each place (probably thousands by now). Now I only need to update the main color on the front end. Having variables and static checks for the constellation IDs is also a great improvement, since the website breaks if there is a typo in one of them.

Individual camera targets are used for anything other than a single constellation (an individual star, multiple constellations…). Splitting the data into two files allows an additional static check with TypeScript. For a constellation, its ID decides which constellation is highlighted, while individual camera targets have an additional optional property that lets them specify multiple IDs for this feature (since they sometimes need to cover multiple constellations or star labels). Their coordinates are also rounded.
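
As a rough illustration of the static checks described above, the two kinds of targets could be typed along these lines. All names here (`constellationIds`, `ConstellationTarget`, `IndividualTarget`, `highlight`, the `exampleOther` ID) and the coordinates are hypothetical; only the shape of the camera object follows the `CameraProps` interface shown later in this article.

```typescript
// Hypothetical subset of the constellation IDs shared with the front end.
const constellationIds = {
  araberBogen: "araberBogen",
  exampleOther: "exampleOther", // made-up ID, for illustration only
} as const

// Union of the allowed ID strings: "araberBogen" | "exampleOther".
type ConstellationId = (typeof constellationIds)[keyof typeof constellationIds]

interface Vector3Like { x: number; y: number; z: number }

interface Camera {
  position: Vector3Like
  up: Vector3Like
  fov: { min: number; max?: number }
}

// Constellation target: its own ID decides what gets highlighted.
interface ConstellationTarget {
  name: ConstellationId
  camera: Camera
}

// Individual target: free-form name plus an optional list of IDs to highlight.
interface IndividualTarget {
  name: string
  camera: Camera
  highlight?: ConstellationId[]
}

// Made-up coordinates, only to show the shape of the data.
const exampleCamera: Camera = {
  position: { x: 0, y: 0, z: 100 },
  up: { x: 0, y: 1, z: 0 },
  fov: { min: 30 },
}

const bow: ConstellationTarget = {
  name: constellationIds.araberBogen,
  camera: exampleCamera,
}

const pair: IndividualTarget = {
  name: "two-constellations",
  camera: exampleCamera,
  highlight: [constellationIds.araberBogen, constellationIds.exampleOther],
}
```

With types like these, a typo such as `name: "araberBogn"` or a missing `camera` property is reported by the compiler instead of breaking the website at runtime.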

Constellations

The biggest gain in file size was made with the constellation mesh data. The original data (divided into various files) is 3708 KB in total, while the final file is 565 KB. This is achieved purely by rounding the coordinate values. Since the tool that generates those coordinates is custom-made, it would also be possible to generate the rounded values directly. However, we decided against it: rounding with Node.js is easy, whereas generating approximate values directly would mean redoing all constellations one by one if we ever wanted to increase the accuracy.

Before

{
      name: constellationIdsAraber.araberBogen,
      renderOrder: renderOrder.araberBogen,
      position: [
        19.593021392822266, -75.09520721435547, 63.141319274902344,
        33.75429153442383, -16.26592254638672, 92.76890563964844,
        2.2514162063598633, -71.55503845214844, 69.89141082763672,
        33.75429153442383, -16.26592254638672, 92.76890563964844,
        19.339262008666992, -15.232765197753906, 96.97400665283203,
        2.2514162063598633, -71.55503845214844, 69.89141082763672,
      ],
      normal: [
        0.23084895312786102, -0.4814058244228363, 0.8455514311790466,
        0.23084895312786102, -0.4814058244228363, 0.8455514311790466,
        0.23084895312786102, -0.4814058244228363, 0.8455514311790466,
        0.2148023098707199, -0.4754035770893097, 0.8531420826911926,
        0.2148023098707199, -0.4754035770893097, 0.8531420826911926,
        0.2148023098707199, -0.4754035770893097, 0.8531420826911926,
      ],
      uv: [
        0.00800000037997961, 0.01443334948271513, 0.00800000037997961,
        0.9824333786964417, 0.9919999837875366, 0.024433350190520287,
        0.00800000037997961, 0.9824333786964417, 0.9919999837875366,
        0.9824333786964417, 0.9919999837875366, 0.024433350190520287,
      ],
    },

After

{
          "name": "araberBogen",
          "renderOrder": 0,
          "position": [
            19.6, -75.1, 63.1, 33.8, -16.3, 92.8, 2.3, -71.6, 69.9, 33.8, -16.3,
            92.8, 19.3, -15.2, 97, 2.3, -71.6, 69.9
          ],
          "normal": [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1],
          "uv": [
            0.008, 0.014, 0.008, 0.982, 0.992, 0.024, 0.008, 0.982, 0.992,
            0.982, 0.992, 0.024
          ]
        },
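
The rounding step itself is small. A minimal sketch, assuming the precision visible in the sample above (one decimal for positions, whole numbers for normals, three decimals for UVs); the `Mesh` type, `roundTo`, and `roundMesh` are hypothetical names, not the project's code:

```typescript
// Hypothetical mesh shape, following the properties in the sample above.
interface Mesh {
  name: string
  renderOrder: number
  position: number[]
  normal: number[]
  uv: number[]
}

// Round to the nearest value with the given number of decimals.
// Adding 0 turns a possible -0 (e.g. from rounding -0.48) into 0.
const roundTo = (value: number, decimals: number): number => {
  const factor = 10 ** decimals
  return Math.round(value * factor) / factor + 0
}

// Return a copy of the mesh with every numeric array rounded.
const roundMesh = (mesh: Mesh): Mesh => ({
  ...mesh,
  position: mesh.position.map((v) => roundTo(v, 1)),
  normal: mesh.normal.map((v) => roundTo(v, 0)),
  uv: mesh.uv.map((v) => roundTo(v, 3)),
})
```

Because the precision is just a parameter here, increasing the accuracy later only means changing a number and re-running the script on the untouched original data.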

Transforming data

The data is sometimes modified for convenience during processing. For example, when a star label is orange, the star it is linked to is also colored orange (this saves time, since a star almost always needs to be orange if its label is). An extra property makes it possible to specify that the label should not color the star, for more flexibility.

If a star is specified as both red and orange, it is colored red: there are far fewer red stars in the scene, so a star that was explicitly added to the list of red stars is likely meant to be red.

// make the stars that have an orange label, orange
culturesKeys.forEach((culture) => {
  const sizes = Object.keys(rawStarLabelWithDifferentType[culture])
  sizes.forEach((size: SizeOptions) => {
    rawStarLabelWithDifferentType[culture][size].forEach((e) => {
      if (e.color === colors.orange_rust__stars && e.colorsStar !== false) {
        orangeStars[culture].push(e.hr)
      }
    })
  })
  // If a star is defined in the red array, it should not be orange
  // red has priority over orange
  orangeStars[culture] = orangeStars[culture].filter((e: number) => {
    if (starsRedWithDifferentType[culture]) {
      return starsRedWithDifferentType[culture].includes(e) === false
    }
    return true
  })
})

The colors in the dataset get transformed from the original variable to values that can be used directly to build the 3D scene.

const colorHighlightThreeJs = new THREE.Color(colors.orange_rust__stars)

const rawColors = [
  {
    color: {
      r: round(colorHighlightThreeJs.r, 1000),
      g: round(colorHighlightThreeJs.g, 1000),
      b: round(colorHighlightThreeJs.b, 1000),
    },
    data: starsToBeHighlighted,
  },
  { color: { r: 1, g: 0, b: 0 }, data: starsRedWithDifferentType },
]
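
The `round` helper used in this snippet is not shown in the excerpt. Judging only from how it is called (`round(value, 1000)`), a plausible version could look like the following; this is a guess at its behavior, not the project's code. With a factor of 1000 it keeps three decimals, which would be enough for color channels stored as values between 0 and 1.

```typescript
// Assumed behavior: round `value` to the nearest multiple of 1 / factor.
// round(x, 1000) keeps three decimals, round(x, 10) keeps one.
const round = (value: number, factor: number): number =>
  Math.round(value * factor) / factor

// Example with an arbitrary channel value (not a color from the project).
const channel = round(0.8235294, 1000) // 0.824
```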

The star size is also modified to reflect the sizes used in the 2D concept: the radius is scaled linearly from the magnitude, so brighter stars (which have a lower magnitude) are drawn larger.

const scaleRadius = scaleLinear([6, -1], [0.35, 2.25])
const radius = Math.round(scaleRadius(+mag) * 10) / 10

The position of a label is derived from the position of a star, which is identified by its hr number. It was built this way because many labels indicate the name of a star, and there are so many stars in the scene that there is plenty of choice for placing a label. Exceptions appeared over time, so it is now also possible to place a label in the middle of a group of stars.

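if (Array.isArray(e.hr)) {
    // several hr numbers: place the label at the average position of those stars
    const tempStarValues = { sdec0: 0, sra0: 0 }
    e.hr.forEach((hr) => {
        const tempStar = starsParsed.find((star) => +hr === +star.xno)
        // fail early instead of crashing on `undefined` below
        if (!tempStar) {
            throw new Error(`No star found for hr ${hr}`)
        }
        tempStarValues.sdec0 += +round(+tempStar.sdec0, 1000)
        tempStarValues.sra0 += +round(+tempStar.sra0, 1000)
    })
    starPosition = {
        sdec0: tempStarValues.sdec0 / e.hr.length,
        sra0: tempStarValues.sra0 / e.hr.length,
        hr: e.hr[0],
    }
}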

Debugging

Processing with Node.js also makes it possible to run additional checks while the files are processed. For the camera targets, the script checks whether the camera's position vector and its up vector are parallel (or close to parallel). If they are, it logs a warning with console.log. This is needed because when those two vectors are parallel, the math used to move the camera around the tool breaks and the user cannot move freely around the screen anymore. Vectors that are close to parallel also cause a problem: the camera can still be moved, but the movement is not smooth. Being able to check this with code is very helpful, because it is very difficult (and time-consuming) to test manually.

import * as THREE from "three"
import { writeFileSync } from "fs"

import cameraTargets from "./data-camera-targets"
import { CameraTargetsCultureProps, CameraTargetIds } from "../src/models"

interface CameraProps {
  position: { x: number; y: number; z: number }
  up: { x: number; y: number; z: number }
  fov: { min: number; max?: number }
}

const difference = (a: number, b: number) => Math.abs(a) - Math.abs(b)

interface CouterPropPropertiesProps {
  quantity: number
  difference: number
}

interface CouterPropProperties {
  same: CouterPropPropertiesProps
  main: CouterPropPropertiesProps
  minor: CouterPropPropertiesProps
  opposite: CouterPropPropertiesProps
}

const incrementDifferenceBetweenDirections = (
  a: number,
  b: number,
  counterProp: CouterPropProperties
) => {
  if ((a > 0 && b > 0) || (a < 0 && b < 0)) {
    // if a and b have the same sign
    counterProp.same.quantity++
    counterProp.same.difference += difference(a, b)
    // shortcut so that we don't have to check same + opposite
    // every time we want the main or minor values
    if (counterProp.same.quantity >= 2) {
      counterProp.main = counterProp.same
      counterProp.minor = counterProp.opposite
    }
  } else {
    // otherwise a and b have opposite signs
    // => one is positive and the other is negative
    // (this branch also catches the case where one of them is 0)
    counterProp.opposite.quantity++
    counterProp.opposite.difference += difference(a, b)
    // shortcut so that we don't have to check same + opposite
    // every time we want the main or minor values
    if (counterProp.opposite.quantity >= 2) {
      counterProp.main = counterProp.opposite
      counterProp.minor = counterProp.same
    }
  }
  return counterProp
}

interface ConsoleLogWarningProps {
  name: CameraTargetIds
  camera: CameraProps
}

const consoleLogWarning = ({ name, camera }: ConsoleLogWarningProps) => {
  if (camera) {
    const normalizedCameraPosition = new THREE.Vector3(
      camera.position.x,
      camera.position.y,
      camera.position.z
    ).normalize()
    const normalizedCameraUp = new THREE.Vector3(
      camera.up.x,
      camera.up.y,
      camera.up.z
    ).normalize()

    const counter = {
      main: { quantity: 0, difference: 0 },
      minor: { quantity: 0, difference: 0 },
      same: { quantity: 0, difference: 0 },
      opposite: { quantity: 0, difference: 0 },
    }

    // `as const` keeps the literal types, so `axis` is typed as "x" | "y" | "z"
    const directions = ["x", "y", "z"] as const

    directions.forEach((axis) => {
      incrementDifferenceBetweenDirections(
        normalizedCameraPosition[axis],
        normalizedCameraUp[axis],
        counter
      )
    })

    const limitSameDirections = 0.15
    const limitDifferentDirections = 0.01

    // if the directions are the same, or opposite
    // (which is important to know if they're parallel or not)
    // and the difference between the values is small
    if (
      counter.main.quantity === 3 &&
      Math.abs(counter.main.difference) < limitSameDirections
    ) {
      // parallel vectors of the camera.position and the camera.up can cause 
      // the camera drag movement to be disturbed / buggy
      console.log(
        `!WARNING! ${name} parallel vectors can make the trackball controls bug, camera.up: ${camera.up.x}, ${camera.up.y}, ${camera.up.z}`
      )
    }

    // if the directions are not all the same or all opposite
    // but the difference is really minimal
    else if (
      counter.minor.quantity === 1 &&
      Math.abs(counter.minor.difference) < limitDifferentDirections
    ) {
      console.log(
        `!WARNING! ${name} very small difference between the vectors can make the trackball controls bug`
      )
    }
  } else {
    console.log(`!WARNING! ${name} does not have any camera coordinates`)
  }
}

const main = async () => {
  // INDIVIDUAL CAMERA TARGETS FOR CONSTELLATIONS + INDIVIDUAL STARS IN THE STORY
  cameraTargets.cultures.forEach((culture: CameraTargetsCultureProps) => {
    // make sure the vectors for the camera position and the camera up
    // are not parallel, otherwise the trackball controls break and
    // it is not possible to navigate correctly anymore
    // https://github.com/mrdoob/three.js/issues/10161
    // if they are likely to cause a bug, a WARNING will be console.logged
    culture.constellations.forEach(({ name, camera }) => {
      consoleLogWarning({ name, camera })
    })
  })

  writeFileSync(
    "../src/content/data/stars/camera-targets/camera-targets.json",
    JSON.stringify(cameraTargets)
  )

  console.log("camera-targets done")
}

export default main

This bug is known in the Three.js community, which was very helpful in solving it.

To come up with a way to test the camera targets, I manually tested multiple ones to find out which ones had a problem. I then compared the ones that had this issue, to find what they had in common. Once the test was created, I tweaked the tolerance settings to make sure that it detected all the ones that I found manually. Once that was put in place, I modified the existing camera targets so that they do not cause issues anymore, while navigating the tool.
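
For comparison, the same idea (warn when the position and up vectors are parallel or nearly parallel) can also be written as a check on the angle between the two vectors. This is an alternative sketch, not the test used in the project: `isNearlyParallel`, `Vec3`, and the default tolerance of 5 degrees are assumptions, and the tolerance would need the same kind of manual tuning described above.

```typescript
interface Vec3 { x: number; y: number; z: number }

const dot = (a: Vec3, b: Vec3): number => a.x * b.x + a.y * b.y + a.z * b.z
const norm = (a: Vec3): number => Math.sqrt(dot(a, a))

// True when the angle between the two lines (ignoring direction) is below
// `minAngleDeg`, i.e. the vectors are parallel or anti-parallel within tolerance.
const isNearlyParallel = (position: Vec3, up: Vec3, minAngleDeg = 5): boolean => {
  const lengths = norm(position) * norm(up)
  // A zero-length vector has no direction; treat it as a problem too.
  if (lengths === 0) return true
  // |cos| is 1 for parallel and anti-parallel vectors, 0 for perpendicular ones.
  const cos = Math.abs(dot(position, up)) / lengths
  // Clamp to 1 to protect Math.acos against floating-point overshoot.
  const angleDeg = (Math.acos(Math.min(1, cos)) * 180) / Math.PI
  return angleDeg < minAngleDeg
}

// Example with made-up vectors:
const parallel = isNearlyParallel({ x: 0, y: 0, z: 10 }, { x: 0, y: 0, z: -1 })
const perpendicular = isNearlyParallel({ x: 0, y: 0, z: 10 }, { x: 0, y: 1, z: 0 })
```

Here `parallel` is `true` (the vectors point in opposite directions along the same line) and `perpendicular` is `false`.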

While modifying the camera targets, I found that changing the position vector is not an option, since only one combination of coordinates centers the constellation on the screen. However, several values of the up vector (the one that determines the camera's rotation) give the same result. It is therefore possible to modify the up vector so that the two vectors are no longer parallel.
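
One possible way to pick such an up vector is sketched below. It assumes the camera looks at the scene origin, so that the viewing direction runs along the position vector; that assumption is not stated in this article. The idea is to remove from `up` the component that is parallel to `position`. What remains is perpendicular to the position, and since a look-at orientation depends only on that perpendicular component, the resulting view should stay the same. `makeUpPerpendicular` and `Vec3` are hypothetical names, and the code uses plain arithmetic instead of three.js.

```typescript
interface Vec3 { x: number; y: number; z: number }

const dot = (a: Vec3, b: Vec3): number => a.x * b.x + a.y * b.y + a.z * b.z

// Returns a unit-length up vector perpendicular to `position`, or null when
// `up` is (almost) parallel to `position` and carries no usable rotation info.
const makeUpPerpendicular = (position: Vec3, up: Vec3): Vec3 | null => {
  const positionLengthSq = dot(position, position)
  if (positionLengthSq === 0) return null

  // Scale factor of the projection of `up` onto `position`.
  const k = dot(up, position) / positionLengthSq

  // Subtract the parallel part; the rest is perpendicular to `position`.
  const rest: Vec3 = {
    x: up.x - k * position.x,
    y: up.y - k * position.y,
    z: up.z - k * position.z,
  }

  const restLength = Math.sqrt(dot(rest, rest))
  if (restLength < 1e-6) return null

  return { x: rest.x / restLength, y: rest.y / restLength, z: rest.z / restLength }
}

// Example with made-up vectors: the z component of `up` is removed.
const fixedUp = makeUpPerpendicular({ x: 0, y: 0, z: 10 }, { x: 0, y: 1, z: 1 })
```

When the function returns null, the original up vector was already parallel to the position, so a new up vector has to be chosen by hand.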