The C Programming Language: Some Initial Thoughts

Akatama · April 5, 2024, 11:23pm

Hey everyone!

Recently, I decided to learn to use the C programming language. I had a few reasons for doing so:

A lot of very good programmers say that learning C will help you be a better programmer.
While it is not used as much as it once was, it is still a very significant language, especially for things that touch hardware, like drivers and kernels.
It is significant historically because of its heavy use in Unix, Linux and BSD.
It is actually a rather small language, so learning the basics can be done rather quickly. This one is pretty significant for me right now, as my second daughter was just born around 2 weeks ago.

Learning the Language
First, I looked up the best way to learn C. Keep in mind, when I was in school I did learn and use C++ (although it has been a good 5 years since I have used it) and I do use C# in my work every day. As a result, I already am fairly familiar with the syntax.

I just did a search on DuckDuckGo for places to learn C, and I found learn-c.org. This course is constructed primarily for people who already know how to program, and it is quite nice. It will teach a concept like pointers, and then have you do a little exercise so you can try it out online. This course could easily be completed in an afternoon if not less, but because of my new daughter, it took me about a week.

Trying it out
I wanted to try out something simple with C. I follow a guy named John Crickett on Twitter who runs a website called Coding Challenges. These are nice because they are somewhat guided, but not so much that the answers are spoiled. If you do need additional help, you can search GitHub or GitLab and you might even find people who are using the same language as you to solve these challenges.

The project I just finished today was a replacement for the Unix command line utility wc. At first, I was able to do pretty well, but due to it being a new language for me, I started to have some problems when dealing with “wide characters” and reading from standard input. I eventually found a GitHub repo of someone who goes by ShellMonk who also solved it in C, and that helped. In the end, I also compared how I had solved problems with how they solved them. If you want to see the repo for my solution you can see it here. I have also credited ShellMonk, as in the end my solution and their solution look almost exactly the same.

Overall, this project was great. Not only did I have to learn how to read options and filenames, open a file and deal with standard input, but I also had to make extensive use of C’s implementation of strings. It was a great first project to learn C.
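For readers who have not tried the challenge, here is a minimal sketch of the counting core that a wc replacement needs. It is not the code from either repo mentioned above; `struct counts` and `count_stream` are hypothetical names chosen for illustration. It takes any `FILE *`, so a file and standard input can share one code path. Multi-byte ("wide") characters are deliberately left out; counting those would need `setlocale` and a wide-character read such as `fgetwc`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result type: the three classic wc totals. */
struct counts {
    unsigned long lines;
    unsigned long words;
    unsigned long bytes;
};

/* Count newlines, whitespace-separated words and bytes on any open stream. */
struct counts count_stream(FILE *stream)
{
    struct counts totals = {0, 0, 0};
    bool in_word = false;
    int ch;

    assert(stream != NULL);

    while ((ch = fgetc(stream)) != EOF) {
        totals.bytes++;
        if (ch == '\n')
            totals.lines++;
        if (isspace(ch)) {
            in_word = false;          /* whitespace ends the current word */
        } else if (!in_word) {
            in_word = true;           /* first byte of a new word */
            totals.words++;
        }
    }
    return totals;
}
```

A caller would pass `stdin` when no filename is given and the result of `fopen(path, "rb")` otherwise, then print whichever totals the options ask for.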

My Thoughts
C is a bit of a strange language. Many things that are hard in C, or at least take some thought, are taken for granted in newer languages. This is good and bad, of course. C is more efficient in terms of computer resources than those languages, but it requires more effort from the programmer in exchange. Also, in terms of memory management, C can do many things that are impossible or hard in these newer languages. That is why C is still used for applications that need efficiency over everything, like the Linux kernel.

I have heard some people recently say that everyone who learns programming should do so first with C or C++. I can definitely see where they are coming from. Those languages are very low level, and having some knowledge of them is very helpful in the long run. However, since these languages are also harder to use than many newer languages, it also threatens to scare some people away from programming. In the end, I would say learning these languages should happen after a person learns a first, newer language, to get their feet wet.

Where next?
When I have time, I might try a few more coding challenges with C. I still haven’t decided if I even really like programming in the language.

I also plan to take this course that will help with low-level programming in C.

nevj · April 5, 2024, 11:55pm

I agree…
I used Pascal before attempting C. That definitely helped.

The other thing that helped was lots of practice… I used C for a large image processing project.

One of the hardest things to do is to understand someone else’s C code. C programmers don’t seem to understand the need for comments, and they use a whole spectrum of complicated styles. Don’t contribute to these issues… keep it simple and comment your code so people can follow it.
C does not force you to write clean code; it is up to you.

pdecker · April 6, 2024, 12:55am

I agree with that too.

Akatama · April 6, 2024, 7:47pm

There has been a push recently to limit comments, which from what I understand came from something Linus Torvalds said.

My understanding of it is that comments do not save bad code, and that comments should be for situations where code is complicated or doing something nonstandard. E.g. iterating through an array with a for loop likely does not need any comments. If you are using that for loop to sort the array it also likely does not need a comment, especially if this for loop is contained in a function called sort_array. If you named the function instead call_jeff or search_largest or whatever then your code is bad and needs to be fixed.

However, if you are doing something in that for loop that is not simple and is something that would require intimate knowledge of the project to understand, then a comment or some documentation is warranted.
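As a rough illustration of that distinction, here is a small sketch in C. Both functions and their names are hypothetical examples, not code from any project discussed here. The first needs no comments inside the loop because the name carries the intent; the second has one line whose form is not obvious, so that line alone gets a comment.

```c
#include <assert.h>
#include <stddef.h>

/* The name says what happens, so the loop body needs no commentary. */
void sort_array(int *values, size_t count)
{
    assert(values != NULL || count == 0);

    for (size_t i = 1; i < count; i++) {
        int current = values[i];
        size_t j = i;
        while (j > 0 && values[j - 1] > current) {
            values[j] = values[j - 1];
            j--;
        }
        values[j] = current;
    }
}

/* Returns the index of target in a sorted array, or -1 if it is absent. */
long find_index(const int *sorted, size_t count, int target)
{
    size_t low = 0;
    size_t high = count;

    while (low < high) {
        /* Written as low + (high - low) / 2 rather than (low + high) / 2
           so that the sum cannot overflow on very large arrays. */
        size_t mid = low + (high - low) / 2;

        if (sorted[mid] == target)
            return (long)mid;
        if (sorted[mid] < target)
            low = mid + 1;
        else
            high = mid;
    }
    return -1;
}
```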

Also, programmers should take into account who is viewing the code when writing documentation or comments. At my job, pretty much only my team views the code we write, and we take great effort to give our functions and tests good names so what they are trying to do is apparent without comments, along with documentation on how these things should be named. We still write comments and documentation for things that work a bit differently than someone expects or that require very specific knowledge.

An open source project, especially one that will be used by end users, should have more comments and documentation. After the recent xz backdoor, maybe there should be a push for more documentation so people with less specific knowledge can look into more projects easily. I can imagine there would be pushback, because that would require significant work, and many of these projects have only a few contributors and maintainers who are already familiar with the code, and they are already stretched thin on what they can do.

nevj · April 6, 2024, 11:54pm

That is a good policy.

I mostly work alone. While I am writing code I put in a lot of comments, just to remind myself what I am doing. Then I take most of them out when finished to reduce the clutter.

Variable names are an issue. I mostly program statistical methods. The more general purpose a function is, the greater the tendency to use mathematical notation, like calling a variable X. I don’t think that is bad… it indicates that the function is general purpose, i.e. that X can be any real number.
On the other hand, in a very specific application, it may be better to name a variable midday_temperature than to name it X.

Akatama · April 7, 2024, 6:20am

There are examples where naming a variable a single character is okay, but these situations are rare. For example, it is considered standard practice to name the temporary variable in a for loop ‘i’ and in a nested for loop ‘i’ for the outer one and ‘j’ for the inner, etc.

I did take a stats class during my master’s degree, but it has been over 5 years since I have used any of it. As a result, my current knowledge of statistics is rather pedestrian. You have a Ph.D., so you did this stuff for a living. However, if we are just talking about general mathematical formulas, I can offer some tips.

Let’s take the famous equation:

E=mc^2

Now, if you were to write a function in code that solved this equation for a given value m, then you could probably get away with just naming the variables m and c (E is what the function would return, so you wouldn’t necessarily need to name any variable for it). For the purposes of this example, let’s see how we can improve it.

c is equal to the speed of light, and while I understand there is some debate amongst physicists whether the speed of light is constant or not, for our purposes let’s just say it is. This is the perfect case for an immutable variable (denoted as const in many programming languages). Normally, these immutable variables are written in all caps, so we could call it SPEED_OF_LIGHT. m is the mass of the body that we are analyzing, so a better name is mass. So we get a function like this.

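/* double rather than long long: c squared is about 9e16, so an integer
   result would overflow for any mass above roughly 100 kg. */
double calculate_energy(double mass)
{
     /* Speed of light in a vacuum, in metres per second. */
     const double SPEED_OF_LIGHT = 299792458.0;
     return mass * SPEED_OF_LIGHT * SPEED_OF_LIGHT;
}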

For the average person, situations where x, y or z are okay variable names would be ones that have to do with graphing on a plane or planes. These are accepted names for variables related to the location of the points you are examining. Often, people will even make a class called Point that has the names x, y and z as properties of that class. In that situation, your variable that is of type Point would probably have a name related to what you are looking at, which would make clear what you are trying to do. For example average_temperature, where y is the average temperature for that time, and x is the hour that the average temperature was calculated. Depending on how many of these average temperatures you have, it might be a good idea to name them even more explicitly, like average_temperature_9am, or put them in some kind of array or list that has all your average temperatures in chronological order.
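A minimal sketch of that idea in C, which has structs rather than classes. The type `struct point` and the helper `highest_point` are hypothetical names for illustration only. The short names x and y live inside the generic type, and the descriptive name goes on the variable that holds the data.

```c
#include <assert.h>
#include <stddef.h>

/* Generic 2D point: short names are fine here because the type is general. */
struct point {
    double x;
    double y;
};

/* Returns the point with the largest y value. Requires count > 0. */
struct point highest_point(const struct point *points, size_t count)
{
    assert(points != NULL && count > 0);

    struct point highest = points[0];
    for (size_t i = 1; i < count; i++) {
        if (points[i].y > highest.y)
            highest = points[i];
    }
    return highest;
}
```

At the call site the variable name says what the points mean, for example `struct point average_temperature[] = {{9.0, 14.5}, {12.0, 19.0}, {15.0, 21.5}};` with x as the hour and y as the temperature, kept in chronological order.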

That said, it does depend on who you are writing this code for. Yourself? Other people who are doing scientific computing? In that situation, using different variable names than what is used in well-known equations might be more confusing than enlightening. The type of programming you are doing is not what I would call typical. As a result, for any future people reading this thread, I cannot recommend using x as a variable name except in possibly this niche case.

nevj · April 7, 2024, 11:24am

You are right, scientific computing is a backwater. It tends to have lots of ad hoc code for one-off calculation problems. It does not really matter if that sort of code is untidy.
But, there are also general purpose mathematical routines. They mostly use short meaningless variable names that come from the original papers.

There is also a complexity issue
Have a look at this R code

# components
  for(iv in 1: v) {   # component iv
    for(il in 1: l) {   # trait il
      # V(fract)
      ib <- (il-1)*l + il    # block no - col of siga for VCii
      vfract[iv,il] <- varz(
                       varlz(vsiga[(ib-1)*v+iv,(ib-1)*v+iv], siga[iv,(il-1)*l+il])
                + varlz(vt(v,l,vsiga,il,il), siga[v+1,(il-1)*l+il])
                - 2.0*covlyz(covcit(v,l,iv,vsiga,il,il), siga[iv,(il-1)*l+il],
                                                         siga[v+1,(il-1)*l+il]),
                                                         fracta[iv,il])
  ....... I cut it here

It is R, not C
If you write something like that with long variable names, it goes over pages.
Those array indices have to be short.
You can see some of my stray comments that should be removed.

There is also the historical use of Fortran on punched cards. Card columns were a scarce resource, so you kept variable names short, and early Fortran only allowed 6 character variable names. Some of us oldies never escaped the habit.

Here is a typical bit of old Fortran

c                                                                       
c     ..................................................................
c                                                                       
c        function iloc                                                 
c                                                                       
c        purpose                                                        
c           compute a vector subscript for an element in a matrix of    
c           specified storage mode                                      
c                                                                       
c        usage                                                          
c           iloc (i,j,n,m,ms)                                           
c                                                                       
c        description of parameters                                      
c           i   - row number of element                                 
c           j   - column number  of element                             
c           n   - number of rows in matrix                              
c           m   - number of columns in matrix                           
c           ms  - one digit number for storage mode of matrix           
c                  0 - general                                          
c                  1 - symmetric                                        
c                  2 - diagonal                                         
c                                                                       
c        remarks                                                        
c           none                                                        
c                                                                       
c        subroutines and function subprograms required                  
c           none                                                        
c                                                                       
c        method                                                         
c           ms=0   subscript is computed for a matrix with n*m elements 
c                  in storage (general matrix)                          
c           ms=1   subscript is computed for a matrix with n*(n+1)/2 in 
c                  storage (upper triangle of symmetric matrix). if     
c                  element is in lower triangular portion, subscript is 
c                  corresponding element in upper triangle.             
c           ms=2   subscript is computed for a matrix with n elements   
c                  in storage (diagonal elements of diagonal matrix).   
c                  if element is not on diagonal (and therefore not in  
c                  storage), ir is set to zero.                         
c                                                                       
c     ..................................................................
c                                                                       
      function iloc(i,j,n,m,ms)                                         
c                                                                       
      ix=i                                                              
      jx=j                                                              
      if(ms-1) 10,20,30                                                 
   10 irx=n*(jx-1)+ix                                                   
      go to 36                                                          
   20 if(ix-jx) 22,24,24                                                
   22 irx=ix+(jx*jx-jx)/2                                               
      go to 36                                                          
   24 irx=jx+(ix*ix-ix)/2                                               
      go to 36                                                          
   30 irx=0                                                             
      if(ix-jx) 36,32,36                                                
   32 irx=ix                                                            
   36 iloc=irx                                                          
       return                                                            
       end

All set out in columns, documentation in comment cards at the top.
I am sure you could rewrite that and make it intelligible to normal humans.
Don’t try, it is not worth it.
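For readers who do not read Fortran, the logic of that routine could be sketched in C roughly as follows. This is only an illustrative translation under some assumptions: the names `storage_mode` and `vector_subscript` are made up here, the unused column count m is dropped, and indices stay 1-based to match the original.

```c
#include <assert.h>

/* Hypothetical names for the three values of ms in the Fortran routine. */
enum storage_mode {
    STORAGE_GENERAL = 0,    /* full n*m matrix stored column by column */
    STORAGE_SYMMETRIC = 1,  /* upper triangle only, n*(n+1)/2 elements */
    STORAGE_DIAGONAL = 2    /* diagonal only, n elements */
};

/* Returns the 1-based position of element (row, col) in the storage vector,
   or 0 if the element is not stored (off-diagonal in diagonal mode). */
int vector_subscript(int row, int col, int n_rows, enum storage_mode mode)
{
    assert(row >= 1 && col >= 1);

    switch (mode) {
    case STORAGE_SYMMETRIC:
        /* A lower-triangle element maps to its mirror in the upper triangle. */
        if (row < col)
            return row + (col * col - col) / 2;
        return col + (row * row - row) / 2;
    case STORAGE_DIAGONAL:
        return (row == col) ? row : 0;
    case STORAGE_GENERAL:
    default:
        return n_rows * (col - 1) + row;
    }
}
```

The arithmetic IF statements and numeric labels become a switch with named cases, and each branch keeps the same formula as the corresponding Fortran line.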

The two big issues in Fortran were array indexing, and I/O. Modern languages have removed that, and introduced other headaches.

Akatama · April 7, 2024, 8:04pm

Well, you did pick two programming languages that I am not familiar with (although I understand that they have some use in scientific computing). But you are correct that I cannot parse them. The comments on the R example do not do much to enlighten me on what this code does. What is “component iv”? I assume it is some kind of list or array. “block no - col of siga for VCii”, no idea. If these comments are helpful for you, at least they are helpful for someone.

About a month and a half ago, while one of my coworkers was on leave, a bug in code that he had written came back to us. Since I was his backup, I had to look into it. I kind of thought I knew what his code did, but then I saw this comment:

// -1 27, -3 29

The comment confused me so much that I immediately became unsure of what his code was doing. So, as a warning, comments can also lead people astray about what the code does if they are not well written. And you do need to take into account who will be reading the code. If siga and VCii are terms used often in your field, then it is probably a fine comment.

I have heard of these variable name restrictions in older languages or older versions of languages. Nowadays, there might be a variable name restriction of some kind, but it is probably something ridiculous like 128 or 256 characters, which is much longer than a good variable name should be, at least in most cases.

Another thing to consider, is that old IDEs (if IDEs were even used, as many people were programming in essentially simple text editors) had no or bad intellisense. However, for at least as long as I have been programming, intellisense will pick up if you are typing a variable name and suggest it to be autocompleted. This helps programmers pick more descriptive variable names as then we do not need to fully type them out each time.

Also, modern IDEs allow for easy renaming of variables, functions and classes. In fact, they are excellent at this. You either select what you want to do from a drop down menu or hit a shortcut, rename the variable, function or class, and the rename will be applied in all files that it appears in. This makes refactoring code much, much easier.

nevj · April 7, 2024, 11:22pm

Yes, I still program in vi (not even vim)
Can’t stand autocomplete or highlighting
but I do use global replace a lot
:g/old/s//new/g
it works like sed or ex, but inside vi.
showing my age

I agree those comments are not helpful. I can’t even remember fully what they mean myself. I do know the variable names… they are, as you guessed, common in that field of work.

Akatama · April 11, 2024, 6:32am

I hope you didn’t feel I was too harsh on your variable names, Neville.

The truth is, naming things in code is very hard, and especially if you have code that keeps changing, what was once a good variable name could be a bad one in the future.

For example, at my job I primarily write automation tests for our code (my job title is SDET - Software Development Engineer in Test). We had a few different users. One of these we just called User, because it was the default one (maybe defaultUser is a better name, but we always call our default user this name). Another one we called ChecklistUser, because it was a user that had access to a feature we called checklist.

However, I wrote that variable name for that user sometime in 2019. In the meantime, the checklist feature became something my company gave to every user, so now the default user also has access to that feature. Last summer a coworker was working on adding some tests, and he used the ChecklistUser. When he was done, he asked me to look at his code. I said “You used the wrong user” and he said “What user should I have used?” and then I had to explain why it was called that, and I realized the name was now very bad. So I had to rename it, and rename it everywhere.

There is a very famous saying I have heard a lot:

“There are only two hard things in Computer Science: cache invalidation and naming things.” - Phil Karlton

I don’t know if I would say those are the only two hard things, but those things are definitely hard, and they are ALWAYS hard. They never get any easier.

nevj · April 11, 2024, 12:21pm

Not at all. I was only giving an historical perspective. I am better than that today.
And thanks for your story. It does highlight that the meaning
of things can change.
I know of no language that helps with naming variables… except for providing adequate length. Maybe object oriented languages tend to guide you?

Akatama · April 11, 2024, 4:12pm

It is kind of a wash. Object oriented languages definitely can help with naming due to inheritance, assuming you do it right. But object oriented languages can also make things messy with poorly named classes quickly. This is why we need guidance like SOLID principles to help keep us on track with our code.

I am definitely more of an object oriented programmer, although I have used Scala in the past, which is a functional language, and I love some of the functional features in C#. As object oriented codebases have grown, people have begun to shift a bit towards the view that functional programming is better. That is not exactly true; they are both great for their own use cases.

For example, there is a somewhat common problem in object oriented programming, where you really just want a function, but in many object oriented languages you can only define functions in a class. So you get a weird, overly specific and long class name. This is one way object oriented codebases can become unwieldy. Use the right tool for the job, if your programming language or framework allows for it.

nevj · April 11, 2024, 5:34pm

I am more functional. That was all that existed when I learnt to program.
You can use plain C in an object oriented manner… it just does not force you to, and it provides little help. … no classes or inheritance but you can do that by hand.
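One common way of doing that by hand is sketched below, assuming a tiny made-up shape hierarchy (all type and function names here are hypothetical). A struct holding a function pointer plays the role of a base class with one virtual method, and each "derived" struct embeds that base as its first member.

```c
#include <assert.h>
#include <stddef.h>

/* "Base class": one virtual method, stored as a function pointer. */
struct shape {
    double (*area)(const struct shape *self);
};

/* "Derived classes": the base comes first, so a pointer to the base
   can be converted back to a pointer to the whole struct. */
struct rectangle {
    struct shape base;
    double width;
    double height;
};

struct right_triangle {
    struct shape base;
    double leg_a;
    double leg_b;
};

static double rectangle_area(const struct shape *self)
{
    const struct rectangle *r = (const struct rectangle *)self;
    return r->width * r->height;
}

static double right_triangle_area(const struct shape *self)
{
    const struct right_triangle *t = (const struct right_triangle *)self;
    return 0.5 * t->leg_a * t->leg_b;
}

/* "Constructors" wire up the method pointer by hand. */
struct rectangle rectangle_new(double width, double height)
{
    struct rectangle r = { { rectangle_area }, width, height };
    return r;
}

struct right_triangle right_triangle_new(double leg_a, double leg_b)
{
    struct right_triangle t = { { right_triangle_area }, leg_a, leg_b };
    return t;
}

/* Works on any mix of shapes without knowing their concrete types. */
double total_area(const struct shape *const *shapes, size_t count)
{
    assert(shapes != NULL || count == 0);

    double total = 0.0;
    for (size_t i = 0; i < count; i++)
        total += shapes[i]->area(shapes[i]);
    return total;
}
```

Nothing in the language enforces any of this: forgetting to set the function pointer, or casting to the wrong struct, compiles without complaint, which is the "little help" part.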

As you said, use the right tool for the job.
R is semi object oriented… it has classes and inheritance is automatic. I think the right tools for scientific programming today are R and Julia, but people in some fields still use Fortran. Fortran has morphed into a modern language.

pdecker · April 11, 2024, 11:04pm

The version I like is:

“There are only two hard things in Computer Science: cache invalidation, naming things, and off by one errors.”

Akatama · April 12, 2024, 12:41am

I think there are some very nice recent languages that are more functionally focused. I learned a bit of Go around December (I want to get back to it soon). Go does have some OO stuff, but it is mostly a functional language. For example, Go does not have inheritance. In fact, it is a bit like C in how it handles classes. Is Go object oriented

That is a good one!

nevj · April 12, 2024, 4:11am

There are often Go programming articles in Linux Magazine.