Assert your assumptions - .NET Core and subtle locale issues with WSL's Ubuntu
This was an interesting and subtle bug that was not only hard to track down but also hard to pin on anyone. I wasn't sure 'whose fault it was.'
Here's the story. Feel free to follow along and see what you get.
I was running on Ubuntu 18.04 under WSL.
I made a console app using .NET Core 3.0. You can install .NET Core here http://dot.net/get-core3
I did this:
dotnet new console
dotnet add package Humanizer --version 2.6.2
Then made Program.cs look like this. Humanizer is a great .NET Standard library that you'll learn about and think "why didn't .NET always have this!?"
using System;
using Humanizer;

namespace dotnetlocaletest
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(3501.ToWords());
        }
    }
}
You can see that I want the app to print out the number 3501 as words. Presumably in English, as that's my primary language, but you'll note I haven't indicated that here. Let's run it.
Note that this app works great and as expected on Windows.
scott@IRONHEART:~/dotnetlocaletest$ dotnet run
3501
Huh. It didn't even try. That's weird.
My Windows machine is en-us (English in the USA) but what's my Ubuntu machine?
scott@IRONHEART:~/dotnetlocaletest$ locale
LANG=C.UTF-8
LANGUAGE=
Looks like it's nothing. It's "C.UTF-8" and it's nothing. C in this context means the POSIX default locale. It's the most basic. C.UTF-8 is definitely NOT the same as en_US.UTF-8. It's a locale of sorts, but it's not a place.
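If you want to see how .NET itself interprets that locale, here's a minimal sketch that prints the ambient cultures next to the LANG variable. The CultureReport class and its method names are made up for illustration; only CultureInfo and Environment come from .NET. My understanding is that .NET Core treats the C/POSIX locale as the invariant culture (whose Name is an empty string), but treat that as an assumption and check what your own machine prints.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of .NET or Humanizer.
static class CultureReport
{
    // Describes a culture in one line. The invariant culture has an empty Name.
    public static string Describe(CultureInfo culture)
    {
        if (string.IsNullOrEmpty(culture.Name))
        {
            return "(invariant)";
        }
        return culture.Name + " (" + culture.EnglishName + ")";
    }

    // Prints the OS-level LANG value and the cultures .NET derived at startup.
    public static void Print()
    {
        string lang = Environment.GetEnvironmentVariable("LANG") ?? "(unset)";
        Console.WriteLine("LANG             = " + lang);
        Console.WriteLine("CurrentCulture   = " + Describe(CultureInfo.CurrentCulture));
        Console.WriteLine("CurrentUICulture = " + Describe(CultureInfo.CurrentUICulture));
    }
}
```

Call CultureReport.Print() as the first line of Main and compare the output on Windows and under WSL.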
What if I tell .NET explicitly where I am?
// Needs "using System.Globalization;" and "using System.Threading;" at the top of Program.cs.
static void Main(string[] args)
{
    Thread.CurrentThread.CurrentUICulture = new CultureInfo("en-US");
    Console.WriteLine(3501.ToWords());
}
And running it.
scott@IRONHEART:~/dotnetlocaletest$ dotnet run
three thousand five hundred and one
OK, so things work well if the app declares "hey I'm en-US!" and Humanizer works well.
What's wrong? Seems like Ubuntu's "C.UTF-8" isn't "invariant" enough to cause Humanizer to fall back to an English default?
Seems like other people have seen unusual or subtle issues with Ubuntu installs that are using C.UTF-8 versus a more specific locale like en_US.UTF-8.
I could fix this in a few ways. I could set the locale specifically in Ubuntu:
locale-gen en_US.UTF-8
update-locale LANG=en_US.UTF-8
Fortunately Humanizer 2.7.2 and above has fixed this issue and falls back correctly. Whose "bug" was it? Tough one but in this case, Humanizer had some flawed fallback logic. I updated to 2.7.2 and now C.UTF-8 falls back to a neutral English.
That said, I think it could be argued that WSL/Canonical/Ubuntu should have detected my local language and/or set the locale to it on installation.
The lesson here is that your applications - especially ones that are expected to work in multiple locales in multiple languages - take "input" from a lot of different places. Phrased differently, not all input comes from the user.
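To make that concrete with something other than Humanizer, here's a small sketch showing that the same date string is read differently depending on the culture it is parsed under. AmbientDateDemo and its methods are hypothetical names for illustration; DateTime.Parse, DateTime.ParseExact and CultureInfo.GetCultureInfo are the real .NET calls. It assumes the machine has normal culture data available (that is, .NET is not running in globalization-invariant mode).

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of .NET.
static class AmbientDateDemo
{
    // Parses text using the date conventions of the named culture.
    public static DateTime ParseWith(string text, string cultureName)
    {
        return DateTime.Parse(text, CultureInfo.GetCultureInfo(cultureName));
    }

    // Parses text against one explicit day-first format,
    // so the result does not depend on any ambient culture.
    public static DateTime ParseExplicit(string text)
    {
        return DateTime.ParseExact(text, "dd/MM/yyyy", CultureInfo.InvariantCulture);
    }
}
```

With this, AmbientDateDemo.ParseWith("01/02/2020", "en-US") is January 2nd, while AmbientDateDemo.ParseWith("01/02/2020", "en-GB") is February 1st. AmbientDateDemo.ParseExplicit("01/02/2020") is always February 1st, because the format is stated rather than assumed.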
System locale and language, time, timezone, and dates are all input as ambient context to your application. Make sure you assert your assumptions about what "default" is. In this case, my little app worked great on en-US but not on "C.UTF-8." I was able to explore the behavior and learn that there was both a local workaround (I could detect and set a default locale if needed) and a library fix available as well.
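The "detect and set a default" workaround could look something like the sketch below. This is one possible approach under my own assumptions, not what Humanizer does internally: CultureGuard and its methods are hypothetical names, and "en-US" is just the fallback I'd pick for my own app. It only overrides the UI culture when the ambient one is the invariant culture, so a user who really has fr-FR or en-GB configured keeps it.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of .NET or Humanizer.
static class CultureGuard
{
    // Returns the ambient culture unless it is missing or invariant
    // (empty Name), in which case the named fallback is returned instead.
    public static CultureInfo Resolve(CultureInfo ambient, string fallbackName)
    {
        bool isInvariant = ambient == null || string.IsNullOrEmpty(ambient.Name);
        return isInvariant ? CultureInfo.GetCultureInfo(fallbackName) : ambient;
    }

    // Applies the resolved culture to the current thread and to threads created later.
    public static CultureInfo EnsureUICulture(string fallbackName = "en-US")
    {
        CultureInfo resolved = Resolve(CultureInfo.CurrentUICulture, fallbackName);
        CultureInfo.CurrentUICulture = resolved;
        CultureInfo.DefaultThreadCurrentUICulture = resolved;
        return resolved;
    }
}
```

Calling CultureGuard.EnsureUICulture() as the first line of Main makes the assumption explicit in one place, instead of hard-coding en-US for everyone.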
Assert your assumptions!
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
From outside the U.S., things inside the U.S. look very hectic. I mean, Americans don't use metric or A4 paper, they write the month before the day in dates, and even when they say they are using metric, they still use Fahrenheit instead of Celsius. And since Microsoft is an American corporation, one must always be careful not to end up with the wrong locale settings.
Per my father, the USA doesn't use metric because President Carter in the 1970s tried to force it on the country and its schools. People revolted and did not accept it; and given the federal power abuses earlier in the 1970s, the USA did not go with metric-only teaching in schools.
Per my father, his much younger co-workers are very weak when using fractions as they were taught decimals and a 50/50 mix of English and metric in school. He only started using metric in his 20s at his first job.
It's up there, per him, with how FDR added by stroke of the pen the 12th grade in high school to keep tens of thousands of 18 year olds out of the workforce during the Great Depression. His father graduated high school at 16 as he skipped an entire grade along with all of his classmates in the 1930s.
My locale is en-GB and timezone is GMT/BST. The problem we quite often run into is that because it's "close enough" to en-US and UTC it's usually quite late before problems with where the input is coming from are picked up.
I recently ran into a very similar problem running a Windows Docker container, which defaults to en-US and UTC. (P.S. I still haven't actually managed to figure out how to set the locale properly on the container image!) Date parsing is particularly problematic because everyone likes to write their dates slightly differently (dd-MM-yyyy vs. MM-dd-yyyy, that type of thing).
Handling the changes in code would be nice, but sometimes it's not possible, especially if you're integrating with an application that's relying on OS calls which read the system locale/timezone. That's why it's important to make sure that the underlying OS is configured "correctly".
In Linux you've got the command line that can set the locale and time zone, but in Windows it seems to be a lot harder to do this via the command line!
I know a lot of .NET devs who have only worked with Windows, and have balked at some of the things that Linux/Unix do or how they handle seemingly "standard" things (line endings, permissions, mount points, etc.). Some of these devs have pushed back and demanded that their apps only run on Windows, or refused to support non-Windows based OSs; whereas most of them have accepted that their assumptions need to change slightly, going forward.
Then again, I think that things like this will lead to more stable software with fewer assumptions (to quote Coach Smiley, "never make an assumption, because you will look like an ass and the ump with tion you"), and more in depth tests.