Getting started with the Julia language

nevj · May 2, 2025, 11:07am

You can tell Julia to use multiple CPUs.
For example, I modified my script

#!/bin/sh
for i in dermis.collagen.images/Manton/smooth/*
  do
    # quote "$i" so file names containing spaces are passed as one argument
    julia -p 4 ./collagen1img.jl "$i"
  done

The -p 4 option tells it to start 4 worker processes.
It actually uses 5 processes: the 4 workers plus the main Julia process.

and I can see its CPU usage

It goes in fits and starts, and the load monitor shows several processes

There are 5 threads in use there (I have 12 available).

Now, did it run any faster?
Without -p 4 it takes about 45 minutes (by the clock on the wall).
With -p 4 it took about 35 minutes.
Not 4x as fast, but some improvement.
Sorry, I have not mastered timing in Julia yet.
That is only one way of using multiple CPUs in Julia.
It would probably have worked better if I had put the shell script loop inside Julia, and parallelised that loop, so that it could process images in parallel.
Lots to learn on that front.
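
A minimal sketch of what moving the loop inside Julia might look like, using the Distributed standard library. The function process_image is a hypothetical stand-in for whatever collagen1img.jl does to one image, and the file names are placeholders. It also shows one way to time a run, with @elapsed.

```julia
using Distributed

# `julia -p 4` already starts 4 workers; add some only if there are none.
if nprocs() == 1
    addprocs(2)
end

# Hypothetical stand-in for the per-image analysis in collagen1img.jl.
# @everywhere defines it on the main process and on every worker.
@everywhere function process_image(path)
    return (path, length(path))
end

# Placeholder names; a real run might use readdir(dir; join=true).
images = ["3506_1.jpg", "3506_2.jpg", "3507_1.jpg"]

# pmap hands one image at a time to whichever worker is free, and
# @elapsed returns the wall-clock seconds the call took.
seconds = @elapsed results = pmap(process_image, images)

println("processes: ", nprocs(), "  workers: ", nworkers())
println(length(results), " images in ", round(seconds; digits=2), " s")
```

With this layout Julia starts once for the whole directory instead of once per image, which may matter as much as the parallelism itself.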

nevj · May 6, 2025, 11:34am

I have this Julia program

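using DataFrames, CSV

# Labels for this directory (hard coded)
farm = "Glensloy"
grade = "smooth"

for arg in ARGS
 # File names look like 3457_1.jpg.csv: sheep number, underscore, field number
 df = CSV.read(arg, DataFrame; delim=",")
 sheep = arg[1:4]
 field = arg[6]
 # Add the label columns, with the same value in every row
 df.Farm .= farm
 df.Grade .= grade
 df.Sheep .= sheep
 df.Field .= field
 # Put the label columns first
 df = select(df,[:Farm, :Grade, :Sheep, :Field, :Area, :Red, :Green, :Blue, :Count])
# cols = [:Farm, :Grade, :Sheep, :Field];
# df = select(df, cols, Not(cols))
 # Write the labelled copy next to the original, prefixed with "lab"
 outfilename = string("lab",arg)
 CSV.write(outfilename,df)
end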

which I use as follows

$ cd ~/juliawork/dermis.collagen.images/Glensloy/smooth
$ julia /home/nevj/juliawork/csvlabgs.jl *.csv

to add some data labelling columns to every .csv file in a directory.
Here is the result

A new lab3457_1.jpg.csv file corresponding to each 3457_1.jpg.csv file.
The labelled .csv files look like this

$ head -3 lab3448_1.jpg.csv
Farm,Grade,Sheep,Field,Area,Red,Green,Blue,Count
Glensloy,wrinkledbn,3448,1,181.0,0.8033233749664436,0.20620361517714916,0.22953362628228247,177
Glensloy,wrinkledbn,3448,1,132.0,0.8204894467841747,0.21903025353139685,0.2688250569987667,129
....

Note: I can use this program in a directory other than ~/juliawork because it does not use an environment.

The trouble is that this version only works for the ~/juliawork/dermis.collagen.images/Glensloy/smooth directory… the labels ‘Glensloy’ and ‘smooth’ are hard coded into it.

It is too much trouble to have it find the labels by parsing the directory name… so I will just make a version for each of the 6 subdirectories.
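
For comparison, a rough sketch of how the labels could be taken from the directory name instead. The function name labels_from_dir, the root keyword, and the choice to join deeper levels with an underscore (so wrinkled/between becomes wrinkled_between) are all assumptions made for illustration, not part of the program above.

```julia
# Hypothetical helper: take the farm and grade labels from a path such as
# .../dermis.collagen.images/Glensloy/smooth
function labels_from_dir(dir; root="dermis.collagen.images")
    parts = splitpath(dir)
    i = findlast(==(root), parts)
    if i === nothing || length(parts) < i + 2
        error("cannot find farm/grade below $root in $dir")
    end
    farm = parts[i + 1]
    # Assumption: deeper levels are joined with "_" to form one grade label.
    grade = join(parts[i + 2:end], "_")
    return farm, grade
end

farm, grade = labels_from_dir("/home/nevj/juliawork/dermis.collagen.images/Glensloy/smooth")
println(farm, " ", grade)
```

Inside the labelling script this would be called as labels_from_dir(pwd()), so one copy of the program could serve all 6 subdirectories.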
So I do the other 5 subdirectories, then I can start pooling all the individual .csv files to make one dataset.
Two steps

Pool all the lab*.csv files within each subdirectory

cd ~/juliawork/dermis.collagen.images/Manton/smooth

head -n 1 lab3506_1.jpg.csv > all && tail -n+2 -q lab*.csv >> all

where lab3506_1.jpg.csv is the first labelled .csv file in that subdirectory… head copies the header line, then tail copies all files without the header line being duplicated.
Repeat for each of 6 subdirectories

Combine the ‘all’ files. I will first omit the on-wrinkle results (I will compare on-wrinkle and between-wrinkle samples later).

cd ~/juliawork/dermis.collagen.images

head -1 Glensloy/smooth/all > expt1.csv && tail -n+2 -q Glensloy/smooth/all >> expt1.csv && tail -n+2 -q  Glensloy/wrinkled/between/all >> expt1.csv && tail -n+2 -q Manton/smooth/all >> expt1.csv && tail -n+2 -q  Manton/wrinkled/between/all >> expt1.csv

$ ls -l expt1.csv
-rw-r--r-- 1 nevj nevj 11598408 May  6 21:32 expt1.csv

Yes, it is about 11.6 MB… not huge but quite a bit of data

Now I am ready for an analysis.

Note: there are better methods of pooling .csv files.
One is to use the program csvstack from the package csvkit.
I will be writing about that separately.
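
The head and tail pooling could also be done in Julia itself. Here is a small sketch using only the standard library. The function name pool_csv is made up for illustration, and it assumes every input file has the same one-line header, which it does not check.

```julia
# Hypothetical helper: concatenate CSV files, keeping the header line
# from the first file only (the same idea as head -n 1 plus tail -n+2).
function pool_csv(infiles, outfile)
    open(outfile, "w") do out
        for (k, file) in enumerate(infiles)
            for (n, line) in enumerate(eachline(file))
                # skip the header of every file after the first
                (n == 1 && k > 1) && continue
                println(out, line)
            end
        end
    end
    return outfile
end

# Small demonstration with two throwaway files.
dir = mktempdir()
a = joinpath(dir, "lab1.csv"); write(a, "Farm,Area\nGlensloy,181.0\n")
b = joinpath(dir, "lab2.csv"); write(b, "Farm,Area\nManton,132.0\n")
pooled = pool_csv([a, b], joinpath(dir, "all"))
foreach(println, eachline(pooled))
```

In a real subdirectory the input list could come from something like filter(startswith("lab"), readdir()).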

Pirx · May 10, 2025, 9:41am

I have used Julia with Geany. All you need to do is point Geany’s build commands to the Julia binaries, or you can create a link to the binaries in your /usr/local/bin directory, and you can enjoy Julia with syntax highlighting.

Julia is one of the easiest languages to use. I’ve mainly used it with Raylib. What’s great about Julia is the community. The forum is very busy and you can see that there are many people interested in Julia’s development.

nevj · May 10, 2025, 1:05pm

I am using vim. I have julia syntax highlighting from a vim plugin.
I did get as far as having vim transfer code to the julia REPL but it was so cumbersome I gave up. I either copy/paste code in and out of the REPL, or use a .jl file which I can execute with the julia command.

Geany looks interesting. It is simple… not a bloated thing like VScode.

As a language, I agree. The only issues I seem to have are with types, but some of its libraries are poorly documented.
I found the image analysis packages OK, but the statistical packages are driving me nuts, especially ANOVA.jl. There is only one example of using the anova() function, there are 4 different ANOVA.jl repos on Github, and I have no idea which one you get when you say Pkg.add("ANOVA"). The help page for anova() is not the same as the version with the example. The whole thing needs a cleanup and proper documentation with examples.
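
One way to see what actually got installed is to ask Pkg. This is only a sketch: it lists every package added directly to the active environment, with its version, UUID, and the local directory holding the source. The README or Project.toml in that directory should show which repo it is.

```julia
using Pkg

# List the packages added directly to the active environment.
for (uuid, info) in Pkg.dependencies()
    info.is_direct_dep || continue
    println(info.name, "  version: ", info.version, "  uuid: ", uuid)
    println("    source: ", info.source)
end
```

The UUID is what tells apart packages that share a name, so it can be compared with the UUID in each candidate repo's Project.toml.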

There is concern, with both Julia and Python, that they download libraries from Github.
Any library could contain malicious code. Where is the checking?
R is different… it has a closely managed repo for library code. You can get safe downloads in R.
Other languages generally install libraries on your computer, along with the compiler, so they are intrinsically safer.

I went to Julia because I want to parallelise some existing R code. I have not got to doing that yet… I need to practice with Julia first. Hence the little imaging project. It is not an end in itself, it is a practice pen.

Pirx · May 10, 2025, 1:17pm

If you’re looking for a simple functional text editor with syntax highlighting and buttons to run/compile programs within, then your search is over.

Do you mean the fact that Julia is dynamically typed? Yes, some people don’t like this, as it may lead to bugs, but it mainly matters in case of large projects and I’m not sure Julia is the best choice in that situation.

pdecker · May 10, 2025, 1:33pm

Python can download from Github, but by default it installs packages from PyPI, the Python Package Index. That is somewhat supervised. It’s open for anyone to contribute, but they do remove malicious code if found. They have protections against third parties modifying packages.

There are always pluses and minuses to open source. If R is more of a walled garden, it could limit flexibility or extensibility but be safer.

nevj · May 10, 2025, 1:43pm

No, I am used to that… R is dynamically typed.
I mean Julia is nominally dynamically typed, but every time I go to do something I seem to have to specify types or do conversions. Images, in particular, have some strange types that hoodwink you into thinking you are dealing with a (0,1) Float, but in terms of how it is stored it is a (0,255) Int. You think you can do arithmetic on them, but you can’t. I guess that is the package implementer, not Julia.
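
A small illustration of that kind of type, assuming the pixel values are N0f8 from the FixedPointNumbers package (an 8-bit fixed-point type; whether it is the exact type in the images discussed here is a guess). The value prints like a number between 0 and 1 but is stored as one byte, and converting it to Float64 gives an ordinary number to do arithmetic on.

```julia
using FixedPointNumbers  # assumption: this package supplies the N0f8 pixel type

x = reinterpret(N0f8, 0x80)   # the byte 128, interpreted as 128/255
raw = reinterpret(x)          # the stored UInt8
f = Float64(x)                # an ordinary float, safe for arithmetic

println(x, " is stored as ", Int(raw), " and converts to ", f)
```

For a whole image array the same conversion would normally be applied elementwise with broadcasting.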

Pirx · May 10, 2025, 1:49pm

Must be. I’ve also used Raylib (it’s a simple library for developing games, written in C and mainly for C) with some statically typed languages, where I had to do a lot of conversions and keep an eye on types. It wasn’t the case with Julia.

nevj · May 10, 2025, 1:52pm

OK, Python is a halfway house… that is better.

You can get R packages from Github too… but everyone uses either CRAN or Bioconductor… because they are comprehensive and safe.
R is very strict. To get a package into CRAN you have to pass various screenings. One of the requirements is documentation with examples… and they test the examples against your code… they have to execute. They test code on several Linuxes, Win, and Mac, and on various architectures.

I never thought of the library sharing side of open source… until I discovered with alarm that every time I used some library in Julia, it went hunting around the internet for a download. I am used to libraries coming with the compiler and having my own copy.
One side effect is you can’t really use Julia without the internet. I think that is less so in R and Python.
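
For what it is worth, packages that have already been downloaded are kept in a local depot, and Pkg has an offline mode that tries to work only from what is already there. A minimal sketch:

```julia
using Pkg

# The first depot is where downloaded packages are normally kept.
println("package depot: ", first(DEPOT_PATH))

# In offline mode Pkg tries to resolve and add packages using only
# versions that are already downloaded.
Pkg.offline(true)
```

This does not help with the first download of a package, only with later use of ones already installed.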

pdecker · May 10, 2025, 1:56pm

I think I agree with that. Python has always been a “batteries included” language. It was created before the internet was in wide use and you had to include everything. In its early years PyPI was known as the Cheese Shop. Python was named for Monty Python, and the Cheese Shop is a reference to one of their famous skits. They even have binary packages as part of some libraries that use what are called “wheels” in reference to a wheel of cheese.

Python: fun and entertaining.

nevj · May 10, 2025, 11:33pm

That might appeal.
R is more a classic language… all work, no play.
I am not sure what Julia is… a mathematician’s idea of fun is not everyone’s cup of tea.