Wanting to convert an ugly formatted text file to a CSV or something

Can’t think of how to do this…

This is a log file / trip meter from a Speedo / Odometer Android app I use on my e-bike…

I think it “prepends” new entries at the top (instead of top down).

Here’s a sample at the bottom of the file, for one trip :






Trip name
29 Oct 2023 ➠ 2:00 pm

Trip started at
10/28/2023➠04:06:39 PM

Starting Latitude,Longitude
REDACTED

Trip ended at
10/29/2023➠02:00:12 PM

Ending Latitude,Longitude
REDACTED

Total trip time
21:53:32

Moving time
2:33:44

Idle time
19:19:48

Distance
14.66 km

Average speed (Moving)
6 km/h

Average speed (Total)
1 km/h

Maximum Speed
90 km/h

Maximum speed (Original)
90 km/h%

And NO! I wasn’t doing 90kmh on my e-bike - fastest I’ve been on it was a tad over 40 km/h downhill…

Each log entry is separated by FIVE (5) carriage returns…
I’d like to :

  1. replace single LF (CR/LF) with a comma (but NOT 5 CRLF).
  2. replace 5 consecutive CR/LF with a single LF.

But how to do that?

I guess I could replace 5 consecutive CR/LF with some arbitrary character (e.g. “#”). Then replace ALL CRLF with “,”, then add “\n” or CRLF to e.g. “#”…

But I’ve no idea how to accomplish any of this… I might have better luck in a word processor perhaps?

What a stupid format to save the info in! How can you graph or collate it?

Ideally - I’d like things like “Total trip time” as “column” headers, with the data for each trip on a single line, with values in CSV…

1 Like

Agree. I would write a C program. I know you will not like that idea. Maybe python or awk?

It has the form
keyword(s) lf
data lf lf

You might be able to use that to just put data in csv file, omitting the keywords

Have a look at tr
it may be able to help with the cr/lf’s, for example

$ cat uid.txt

My 
UID 
is 
1000

$ tr "\n" " " < uid.txt

My UID is 1000
2 Likes

I would use a word-processing tool to open the file
Then use find and replace to convert returns to CSV. You may need to search special and 2 returns to CSV
Save it in csv format

Then use spread sheet to open the csv and convert it from csv to columns
Then you can do your calculations as needed

But it may be quicker just to write them in a spreadsheet manually

Hard work… I just cycle for pleasure now

2 Likes

I do it for pleasure too… But I want to know what my expected ranges might be : e.g.

ride in 5th gear (7 speed rear derailleur) on pedal assist 2
ride in 4th get on pedal assist 5
(pedal assist 5 is the maximum)
I get severe “range anxiety” (it’s a common term with PEV [personal electric vehicle])

The manufacturer of the e-bike claimed an impossible range of 110 km - my ARSE!

Thing weighs a hefty 35+ kg or so… No way am I going to ride it on PA (pedal assist) 1 in a high gear… never mind with no pedal assist (it came with a twist throttle too - but I physically disabled that, and also disabled it in the firmware - it’s illegal to have a throttle e-bike over 200 W motor here).

I can only figure out those averages if I can collate each trip as a row in a spreadsheet…

But I need to break up that hideous log file into use-able data… Not about to learn “C” to solve this - I’m closer to learning Python than C - and that’s very very rudimentary…

I think I can try and figure something out with a bunch of grep, and maybe awk, and maybe sed… I don’t think LibreOffice would do it - I have MS-word - it’s better at search and replace than LibreOffice Writer (which seems more geared to Asian languages and character sets).

I also want to buy a new battery - so I can do some longer trips…

The single guy who wrote the app - is in New Delhi… I paid for a Pro license… I should be able to get some use-able data from a “Pro” app… but apparently not…

2 Likes

Never tried in libre office just word 6.

On electric bikes, my wifes is almost dead, she gets around 10 km per charge now, almost not worth riding.

I am wondering about a folding bike with little wheels instead of 26 inch. Plus buying a spare battery. She will be 65 shortly and thinking if its a better bet. Not sure you have décathlon in your area but they do 700 euro bikes which is enough to pay for what she does. I may buy the same for me without electric as need the exercise.

Any thoughts

IMHO it’s clearly an awk task.

First off, I convert the file with dos2unix or tr to Unix line endings.
The result is converted to CSV format with my little script.

$ awk -f 2csv.awk in.txt > out.csv

The result isn’t really complete, as the units aren’t split off and the date/time formats need some adjustment.

"Trip name","Trip started at","Starting Latitude,Longitude","Trip ended at","Ending Latitude,Longitude","Total trip time","Moving time","Idle time","Distance","Average speed (Moving)","Average speed (Total)","Maximum Speed","Maximum speed (Original)"
29 Oct 2023 ➠ 2:00 pm,10/28/2023➠04:06:39 PM,REDACTED,10/29/2023➠02:00:12 PM,REDACTED,21:53:32,2:33:44,19:19:48,14.66 km,6 km/h,1 km/h,90 km/h,90 km/h%
29 Oct 2023 ➠ 2:00 pm,10/28/2023➠04:06:39 PM,REDACTED,10/29/2023➠02:00:12 PM,REDACTED,21:53:32,2:33:44,19:19:48,14.66 km,6 km/h,1 km/h,90 km/h,90 km/h%

Here is my script 2csv.awk:

BEGIN {
	print	"\"Trip name\"" "," \
	"\"Trip started at\"" "," \
	"\"Starting Latitude,Longitude\"" "," \
	"\"Trip ended at\"" "," \
	"\"Ending Latitude,Longitude\"" "," \
	"\"Total trip time\"" "," \
	"\"Moving time\"" "," \
	"\"Idle time\"" "," \
	"\"Distance\"" "," \
	"\"Average speed (Moving)\"" "," \
	"\"Average speed (Total)\"" "," \
	"\"Maximum Speed\"" "," \
	"\"Maximum speed (Original)\""
	;
	i = 0;
}

function save() {
	getline;
	arr[i++] = $0;
}

/Trip name/ {
	save()
}

/Trip started at/ {
	save()
}

/Starting Latitude,Longitude/ {
	save()
}

/Trip ended at/ {
	save()
}

/Ending Latitude,Longitude/ {
	save()
}

/Total trip time/ {
	save()
}

/Moving time/ {
	save()
}

/Idle time/ {
	save()
}

/Distance/ {
	save()
}

/Average speed \(Moving\)/ {
	save()
}

/Average speed \(Total\)/ {
	save()
}

/Maximum Speed/ {
	save()
}

/Maximum speed \(Original\)/ {
	save()
	for (key in arr) {
		printf "%s%c", arr[key],key < 12 ? "," : "\n"
	}
	i = 0
}

HTH
abu

4 Likes

Hey @abu ,
That is a nice bit of awk programming.
It is good to see some of the original Unix skills are still around
Regards
Neville

3 Likes

That’s fantastic - thanks for taking the time and effort to solve this problem!

Much appreciated! CHEERS! :smile:


I’d never be able to get my head around that awk script… Very well done I must say - but the output leaves something to be desired - it correctly puts the column headers at the top of the file - but not all fields end up in the correct place - and still lots of newlines (in the same log record).

I ended up using sed to get each log entry into a single row - VERY long rows…

Replace ALL newlines with commas (end up with a HUGE long string on a single line) :
sed -i ':a;N;$!ba;s/\n/\,/g' ff.txt
Please don’t ask me to explain that - I found it on Stack Overflow :smiley:

So instead of one of the answers :
sed ':a;N;$!ba;s/\n/ /g' file
I did :
sed ':a;N;$!ba;s/\n/,/g' file
Odd that neither the space nor the comma need to be escaped - I’m so used to escaping space I did it from muscle memory (e.g. using (p)rename to remove spaces from a filename).

Then replace all instances of six commas - with a newline :
sed -i 's/,,,,,,/\'$'\n''/g' ff.txt
was rather surprised I didn’t have to escape those commas - but it worked…
That log entry I mentioned above now looks like this :

Trip name,29 Oct 2023 ➠ 2:00 pm,,Trip started at,10/28/2023➠04:06:39 PM,,Starting Latitude,Longitude,REDACTED,,Trip ended at,10/29/2023➠02:00:12 PM,,Ending Latitude,Longitude,REDACTED,,Total trip time,21:53:32,,Moving time,2:33:44,,Idle time,19:19:48,,Distance,14.66 km,,Average speed (Moving),6 km/h,,Average speed (Total),1 km/h,,Maximum Speed,90 km/h,,Maximum speed (Original),90 km/h

Unfortunately - I think there’s a different number of “columns” in some lines - as some “Trip name” have an arbitrary string I assigned it in the app - like this one :

Trip name,P2-lil2red2bas-1,,Trip started at,04/15/2025➠04:22:20 PM,,Starting Latitude,Longitude,REDACTED,,Trip ended at,04/15/2025➠06:14:08 PM,,Ending Latitude,Longitude,REDACTED,,Total trip time,1:51:48,,Moving time,1:06:07,,Idle time,0:45:41,,Distance,21.60 km,,Average speed (Moving),20 km/h,,Average speed (Total),12 km/h,,Maximum Speed,32 km/h,,Maximum speed (Original),32 km/h

Hmmm - maybe not - where I didn’t save the trip in the app - it defaults to date and time as the “Trip name”…

This is good - this is progress - I can work with this data now… I’ll probably do some more tweaking - e.g. replace the string “Latitude,Longitude” with something like “GPS”

sed 's/Latitude,Longitude/GPS/g' ff.txt

I’d rather not do all this tweaking by hand in a Word Processor or a spreadsheet - because I’ll be grabbing new data off my phone constantly… Then also remove empty columns (“,”) :
sed -i 's/,,/,/g' ff.txt

This is mostly scriptable and repeatable… Which was what I wanted…

I can then do some more tweaking e.g. take out those names and make a header row… But that’s a lot of strings to manage… But now it’s in columns - I might be able to use “cut” which I haven’t used for a couple decades now…

2 Likes

OK - figured out how to remove the field names from the body :

sed 's/Trip name,//;s/Trip started at,//;s/Starting GPS,//;s/Trip ended at,//;s/Ending GPS,//;s/Total trip time,//;s/Moving time,//;s/Idle time,//;s/Distance,//;s/Average speed (Moving),//;s/Average speed (Total),//;s/Maximum Speed,//;s/Maximum speed (Original),//' ff.txt

I didn’t realise you could do that with sed (i.e. multiple “nested” string replaces with semi-colon)…


So now I can load this into LibreOffice Calc and makes some guestimates based on km travelled, battery bars used, and anticipated range :

I’ll probably just keep using that spreadsheet file when I get new rows of data to append…

BTW - a couple of those estimates look wildy optimistic : e.g. the top row of 108 km - I don’t believe that for a minute… There’s some exponentially decreasing battery remaining process that chews up more battery, the less battery, you have left…

3 Likes

I also started with sed, but it ended up with too many different calls. Using awk, you can do it all in one run.

2 Likes

Thanks for your kind words.

1 Like

It was just my first take. As I mentioned, I didn’t do any further formatting and I couldn’t parse your REDACTED fields. Are you sure you did the line ending transform first? I tried without, but this didn’t work.

image

1 Like

The “REDACTED” fields have a comma in them - so will be tricky to parse - they’re GPS co-ordinates to my start, and end points - i.e. my home address :smiley: - so not going to share that on a public google searchable forum…

I wasn’t criticizing your input - it was interesting to read… and was fascinating to see when I saved your awk script with a *.awk extension - I got syntax highlighting in vim and Gnome Text Editor (gedit)…

Thanks for your interesting contribution - I’m grateful you took the time to look into my issue…

I’ve managed to resolve it using sed, and pretty much only, sed.

Also - I only gave one record in my original post - so - not much data to go on - as they say “garbage in, garbage out” :smiley:

Now I’m trying to figure out how to get the output file “Records.txt” (which incidentally - didn’t show any “^m” chars in it (it was generated on a Linux system - i.e. Android) and the file before, and after, running “dos2unix” didn’t change… So I’d say the app is generating correct text output (vs the abomination that is MS-DOS / Wintel text files)) - using TermUX - but I don’t have UNIX permissions to see files in the app output folder - I can only see them when using USB to my phone (from Linux). It’s odd - I can see all the files in my Phone camera’s output folder using TermUX (via ssh and rsync)…

it’s obvious the developers of the firmware made some error in their calcs… One of the reasons I started using a GPS based speedometer was I thought the report speed was wrong - and I was right - e.g. it reports I’m doing 23 km/h - but GPS has me at 25 km/h… But I did the exact same run as yesterday just now - and got home with battery indicator showing 1 bar (of 5) used…

Contrast that with my Segway Ninebot e-scooters - both of them - report the same speed as my GPS app… So Segway’s engineers are a lot more capable than the manufacturer of my e-bike (Cmacewheel - but branded Kristall in Australia).

2 Likes

I’m too old now to be writing code for each job of this kind. My latest was converting a book written in W*rd to LaTeX, via LibreOffice Writer - it’s in French, which has its idiosycracies in wordprocessors as well as in LaTeX.

More and more, I’m using Notepad++ (native in W*ndows), with in particular the MultiReplace extension. Notepad can handle line endings and do CSV. I have the impression from the way it works that this application is frequently used by people who are paid to do this kind of work.

1 Like

I mostly use Notepad++ as a “paste bin” on the Windows jumphosts I access for work… In some cases - RDP nested 3 layers deep - and clipboard integration can be a bit flaky… I like to just press the “new” tab button in Notepad++ - and when I exit - then re-open it’s still got all the tabs I had open before - very useful…

One thing I kinda hate about Notepad++ is it’s not very friendly to the visually impaired…

1 Like

Remembering the times when I was forced to use Windows: Notepad++ was clearly one of my favorites. The first thing I installed when setting up a new Windows box.

2 Likes