When did we stop caring about memory management?
This post is neither a rant nor a complaint, but rather, an observation.
There's some amazing work happening over in the C#-based Kestrel web server. This is a little open source web server that (currently) sits on libuv and lets you run ASP.NET web applications on Windows, Mac, or Linux. It was started by Louis DeJardin, but more recently Ben Adams from Illyriad Games has become a primary committer and obsessive optimizer.
Kestrel is now doing 1.2 MILLION requests a second on benchmarking hardware (all published at https://github.com/aspnet/benchmarks) and it's written in C#. There's some amazing stuff going on in the code base with various micro-optimizations that manage memory more intelligently.
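To give a flavor of that kind of optimization (a minimal sketch, not Kestrel's actual code; the handler and buffer size here are made up), one common trick is renting reusable buffers from System.Buffers.ArrayPool instead of allocating a new byte[] per request:

using System;
using System.Buffers;

class BufferPoolingSketch
{
    // Hypothetical request handler: rent a reusable buffer from the shared pool
    // instead of allocating a fresh byte[] per request, which reduces GC pressure.
    static void HandleRequest(byte[] payload)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(payload.Length); // may hand back a larger array
        try
        {
            Array.Copy(payload, buffer, payload.Length); // work against the pooled buffer
            // ... parse, transform, write the response, etc.
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer); // hand it back for the next request
        }
    }
}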
. @ben_a_adams I think it's safe to say you're on to something in that PR pic.twitter.com/ELIyxhYyun
— Damian Edwards (@DamianEdwards) December 23, 2015
Here's my question to you, Dear Reader, and I realize it will differ based on your language of choice:
When did you stop caring about Memory Management, and is that a bad thing?
When I started school, although I had poked around in BASIC a bit, I learned x86 Assembler first, then C, then Java. We were taught intense memory management and learned on things like Minix, writing device drivers, before moving up the stack to garbage collected languages. Many years later I wrote a tiny operating system simulator in C# that simulated virtual memory vs physical memory, page faults, etc.
There's a great reference here at Ravenbrook (within their Memory Pool System docs) that lists popular languages and their memory management strategies. Let me pull this bit out about the C language:
The [C] language is notorious for fostering memory management bugs, including:
- Accessing arrays with indexes that are out of bounds;
- Using stack-allocated structures beyond their lifetimes (see use after free);
- Using heap-allocated structures after freeing them (see use after free);
- Neglecting to free heap-allocated objects when they are no longer required (see memory leak);
- Failing to allocate memory for a pointer before using it;
- Allocating insufficient memory for the intended contents;
- Loading from allocated memory before storing into it;
- Dereferencing non-pointers as if they were pointers.
When was the last time you thought about these things, assuming you're an application developer?
I've met and spoken to a number of application developers who have never thought about memory management in 10 and 15 year long careers. Java and C# and other languages have completely hidden this aspect of software from them.
BUT.
They have performance issues. They don't profile their applications. And sometimes, just sometimes, they struggle to find out why their application is slow.
My buddy Glenn Condron says you don't have to think about memory management until you totally have to think about memory management. He says "time spent sweating memory is time you're not writing your app. The hard part is developing the experience so that you know when you need to care."
I've talked about this a little in podcasts like the This Developer's Life episode on Abstractions with guests like Ward Cunningham, Charles Petzold, and Dan Bricklin as well as this post called Please Learn to Think about Abstractions.
How low should we go? How useful is it to know about C-style memory management when you're a front-end JavaScript Developer? Should we make it functional then make it fast...but if it's fast enough, then just make it work? The tragedy here is that if it "works on my machine" then the developer never goes back to tighten the screws.
I propose it IS important but I also think it's important to know how a differential gear works, but that's a "because" argument. What do you think?
Probably what would be good to know is the difference in memory speed between then and now.
Before, the problem was reading from HDDs, and anything you could keep in RAM was just fast.
Nowadays RAM is slow and CPU cache is fast. That means you have to think about memory allocation if you want top performance. Ironically, there were days when RAM was 640 KB and HDDs were 1, 20, or 100 MB (pick any), and now CPU cache is about 1 MB and RAM is measured in gigabytes. So history repeats itself.
It's just that not everyone wants a performant app. Good developers do.
It's the same idea as the elephant that doesn't try to break its chains: someone, somehow, put that thought in our minds and we believe it without questioning.
Combine this with how rarely we develop programs that run on machines with performance or memory restrictions, and this concern just gets lost.
"Let's stand on the shoulders of giants and let the GC do all this work."
Those who are experienced in C# will try to keep an eye on memory usage and performance, but most of the time it simply isn't followed through.
To mitigate this lack of knowledge or effort by our developers, we run memprofiler and dotTrace at the end of each release cycle, mostly looking for memory leaks and very slow processes.
I hope this gives some insight on where some of us are.
In the computer industry we too often try to say someone can do everything and that's not a mature viewpoint. We have entered an age of specialization as you do in any scientific field. It's just a sign of our industry maturing beyond hobbyist stage.
Garbage collectors protect us from a lot of the dangers of manual memory management, but they don't protect us from the costs.
So even front-end JavaScript developers should know how much memory their code is really consuming.
Nobody targets just the data they actually need anymore, and then they legitimately can't understand why their application is slow when they've just pulled a 100 MB database into memory, across the network, down to the client side in JSON (works on my machine on my wired GB network at work!).
I have optimized many slow systems as my career has gone on. The understanding of what's happening inside the black box is becoming a lost art.
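A hedged sketch of the alternative being hinted at here - ask the server for only the fields and rows the page needs instead of shipping the whole table down as JSON (the entity, DTO, and query names below are invented for illustration):

using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public byte[] Photo { get; set; } // the heavy column we deliberately do not select
}

public class CustomerSummaryDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CustomerQueries
{
    // Project only the columns needed and cap the row count, instead of
    // serializing the entire table into memory and onto the wire.
    public static CustomerSummaryDto[] GetFirstPage(IQueryable<Customer> customers)
    {
        return customers
            .OrderBy(c => c.Name)
            .Select(c => new CustomerSummaryDto { Id = c.Id, Name = c.Name })
            .Take(50)
            .ToArray();
    }
}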
As to PCs: well, the build I'm doing right now has 64 GB of memory in it. It offends the old-school programmer in me, but it's cheaper to *not* think about memory management when it comes to single user systems.
Pete
I think between Garbage Collection and RAM not costing $45 a megabyte, we care less, which has made us careless.
I do .NET, iOS and Node.js development, and I run into memory problems all the time on all platforms. Mobile devices do not have as much memory to dedicate as servers do, so you find yourself having to be careful about memory management.
Good post!
Thanks Scott.
I was thinking about this topic recently while reading Ben Watson's book "Writing High-Performance .NET Code" (the GC chapter is fantastic!). While I try to keep performance/resource utilization in mind generally, I know I could do better in the perf department. That said, I think having some insight into how things work under the covers is sneakily helpful. It naturally steers you away from some of the more egregious memory management sins, even when you're not making an explicit optimization pass over your code.
https://en.wikipedia.org/wiki/Automatic_Reference_Counting
All the benefits of not caring about memory management (mostly - ARC doesn't break retain cycles for you) with none of the performance issues of garbage collection.
Swift is of course ARC only.
GC "memory allocation is like drinking alcohol, it's the hangover that hurts"
The amount of available memory, whether it's cheap or not, does not matter when the users push the limits. In the end user experience is what counts.
Sure, for web servers/operating systems etc. it does, but for most things speed of development is far more important.
That's not to say that you shouldn't know what you're doing when you do it; and you need the knowledge to make informed decisions about how to best use the resources available to you.
In my experience, though, memory is relatively cheap, man-hours are relatively expensive, and .NET's GC is far better supported, developed, tested, optimised and generally made than anything a single dev could possibly build.
This prompted me to spend the time and consolidate all my webjobs into a single console app (using the amazing WebJobs SDK), which cut my resource usage in half. Additionally, I have to think about memory a lot because in a continuous webjob you don't get the benefit of the console app shutting down after processing--memory sticks around, and in a background job you're probably processing a ton of data--so you need to be extra careful how you bring down and work with the data. I noticed, for example, a ton of memory being used by JSON.NET objects--it was because one of my objects coming back from the database was being fully deserialized, and one of its nested classes used up a ton of memory, even though I wasn't actually using that class since I only cared about the ID. So I switched to using raw JTokens instead (roughly as sketched below), and that drastically reduced memory usage.
So, tl;dr--when I moved to the cloud. Also, my current role at work is managing infrastructure, so I care when apps on my servers use upwards of 2 GB of memory; I sit down and have a talk with the devs and take/analyze memory dumps.
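A minimal sketch of the JSON.NET switch described above (the property name is hypothetical): instead of deserializing the whole object graph, parse the document as a JToken and pluck out only the ID.

using Newtonsoft.Json.Linq;

class JTokenSketch
{
    // Full deserialization would materialize every nested class, including the
    // memory-hungry ones that are never used. Parsing to a JToken keeps only
    // the one value we actually care about.
    static string GetDocumentId(string json)
    {
        JToken doc = JToken.Parse(json);
        return (string)doc["Id"]; // hypothetical property name
    }
}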
Industry
With regards to memory, I think as an industry it is hard to say when memory became less important. It is almost like the whole "golden age" thing where people look back and think it has already been the best ever and it won't be like that again - except the inverse. Which is to say, people used to have to use paper punch cards and now we have robots.
Me
What I can say from my experience is that in school, memory was demoted with the release of the iPhone. It is perhaps a cliche thing to say, but in my opinion it is true. In school, as you reference, students go from BASIC on up to GC languages and eventually they end up at mobile. Mobile used to be a precise endeavor.
I remember writing a program for a RAZR phone in Java, and it only had 32 MB of memory available to use and no floating point! Compare that with writing a program for an iPhone where all the ns-bells-and-whistles are present and it is night and day.
Developing
Given all of that, on a day to day basis I am an asp.net mvc full stack developer. Developing for the web still makes me very conscious of memory, I think about it all the time. Web development, in my opinion, means that page load needs to be less than 2 seconds for every page. This means that dealing with milliseconds becomes second nature over time, and I frequently think of time spans in milliseconds as a result.
On the front end poor memory management will make your page either load slow, or react slow - especially on smaller devices or poor connections (I liked your post on lower bandwidth as well). Either loading or reacting slow will drive users away and cause headache.
On the back end, memory management is also very important. It can mean the difference between needing a full support caching server or not, as well as keeping database queries limited (which ties into load speeds).
Algorithms
As data becomes larger and larger, analyzing it requires more and more processing power. For example, in "big data" the time complexity of the analysis may be very efficient, yet even touching every element once is terrible performance. This has led to space-for-time algorithms becoming prevalent, where building a massive hash set allows for easier access (a tiny sketch follows below).
The increase in space-complexity considerations drives the demand for larger available memory in general, and as a result this means that it is more readily available in other venues.
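A tiny example of that space-for-time trade, as mentioned above: building a hash set costs memory up front, but it turns repeated membership checks from linear scans into constant-time lookups.

using System.Collections.Generic;
using System.Linq;

class SpaceForTimeSketch
{
    // Scanning a large list is O(n) per lookup; building a HashSet spends
    // O(n) memory once so that each subsequent lookup is O(1).
    static int CountKnownIds(IReadOnlyList<int> knownIds, IEnumerable<int> incoming)
    {
        var index = new HashSet<int>(knownIds);          // the up-front space cost
        return incoming.Count(id => index.Contains(id)); // cheap lookups afterwards
    }
}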
tl;dr;
I think small devices having large amounts of available memory led to developers considering memory management less (almost no one does DMA, direct memory access, anymore for graphics on small devices). However, the web is still very memory sensitive.
Then it became a case of variable scope management; optimising for performance is now normally a case of using more memory for caching to reduce fetch times.
This week it became an issue on Azure, with a background worker running multiple parallel imports encountering memory/performance issues; the solution was to limit the number of parallel imports.
I don't miss the days of HIMEM.SYS and dangling pointers, so having a garbage collected runtime makes my cognitive overload go down and frees me for other concerns.
These days, I worry more about following good practice in the rate of allocations instead of the size of them. In short, the allocations (which lead to GCs) become more important than the memory in use. We went from caring about char[] to knowing when to use StringBuilder instead of String. We went from caring about alloc/free to knowing when to cache pools of pre-allocated arrays. Sure, most junior developers, throw-away CRUD apps, or small systems don't need to care, but training them to do it right prevents incurring the technical debt when it DOES suddenly matter. It's the same in the "front-end" world, where it pays to remember that doing
var $foo = $('#foo');
// use $foo lots
Beats this every time
$('#foo').thingone();
$('#foo').thingtwo();
This doesn't make us sloppier any more than the ability to just draw on a canvas makes us lazy. When you get to the performance (in time OR quality) of those tasks it still pays to understand all the transitions and mappings occurring. Think about all the memcpys and math taking place between loading a .JPG into memory, mapping it to a canvas, scaling the canvas to the window, BLTing it to the DC, the DC pushing to the framebuffer, framebuffer reads by the video controller, controller to HDMI, HDMI to buffer, buffer to LCD array... it's a wonder anything ever appears on-screen.
#GetOffMyLawn
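For the StringBuilder point above, a minimal C# counterpart to that jQuery example (a sketch, not a benchmark): repeated string concatenation allocates a new string on every pass, while StringBuilder grows one internal buffer.

using System.Text;

class ConcatSketch
{
    // Allocates (and copies) a brand-new string on every iteration.
    static string Slow(string[] parts)
    {
        string result = "";
        foreach (var p in parts)
            result += p;
        return result;
    }

    // Appends into one growing buffer and produces a single final string.
    static string Fast(string[] parts)
    {
        var sb = new StringBuilder();
        foreach (var p in parts)
            sb.Append(p);
        return sb.ToString();
    }
}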
I've met and spoken to a number of application developers who have never thought about memory management in 10 and 15 year long careers
I've been developing in C and C++ for the last 15 years. I guess I'm now part of the old guard.
I think over the next 10 years we're going to have to get back into this stuff more fundamentally, too. The new wave of ultra-small, ultra-cheap hardware that's going to saturate the world in the form of wearables and ubiquitous sense-and-react devices is getting back to RAM measured in megabytes instead of gigabytes, with low-powered CPUs that don't really suit garbage-collected memory management. I have absolutely no doubt that the insanely clever people who make things like .NET are going to fight back against this and make amazing things to make our lives easier again, but for a few years at least you're going to get to work on more fun, cutting-edge stuff if you're prepared to roll up your sleeves and at least learn the fundamental principles.
Oh, and Kestrel is amazing considering .NET doesn't really have an equivalent to Java's NIO.
P.S. @Chris: I know they're working on SSL in Kestrel, but you really want to use something like Nginx to do SSL termination and reverse-proxy to Kestrel. If you're running on Linux, do the reverse-proxy over a Unix Domain Socket for extra performance. IMHO, the biggest reason to get native TLS support in Kestrel is so it can do HTTP/2.
I remember Rovio's presentation on their hybrid (web/native) application, where Angry Birds used a similar old-school optimization technique as Super Mario Bros. There were movie theater curtains in Angry Birds, but they sent only one side of the curtain and flipped it just before rendering to get both sides. It's nice to see that those old-school techniques are still relevant today.
On the flip side though, I have thought about performance. I've thought about the number of requests, payload size, etc. when building web apps. I guess in a way I have thought about memory management, just differently than folks used to have to think about it.
In my opinion, every programmer should at least know about the mechanism of memory management in the languages they're using. Also, I think, every programmer who wants to be a real programmer should know the C programming language.
It's analogous to the CPU, we don't think about how many cycles we're wasting, at least not like an Atari 2600 programmer did. CPU's are fast enough and we can worry about more important things.
If you're building a web server like Kestrel or working on a search engine like Bing, then yes, it's a domain of knowledge which is required.
Writing LOB apps which take user input and process it with some business rules? Not something to be concerned about too much.
I liked the block structure of PL/I and how that helped make memory management easy and efficient for the machine and easy for the programmer. Also important was that the exceptional condition handling walked back in the stack of dynamic descendancy and did the right things with memory along the way -- got rid of a lot of 'memory leaks' automatically -- not a pun. And the static memory was task-relative, so that if a task went away, so did its memory, also automatically. PL/I -- darned nice language. A George Radin victory.
Then on a team that wrote an AI language, I thought about memory management and gave up as I discovered that for essentially any (IIRC just any) scheme of memory management, there was a situation that made a mess out of memory, that is, left a lot of holes. So, since there can be no really perfect solution, I settled on the PL/I approach as good enough in practice.
In our program, IIRC we used Cartesian trees to help with memory management.
Currently popular languages gave far too much weight to the design decisions in C (IMHO a total disaster for computing), far behind Algol, Fortran, PL/I, Pascal, Ada, at least, but, sure, what the heck do you expect from a one-pass compiler on an 8 KB memory DEC PDP-8 or whatever that toy was?
We'd all be much better off with some block structure, scope of names, etc. -- heck, just borrow from PL/I. The emphasis on C instead of PL/I was a gigantic case of a whole industry shooting itself in the gut. Much worse -- C++. I never could get a clean definition of that language; I'm not sure Stroustrup has one either.
For more, the PL/I structures are so nice, far, far ahead of C, that OO programming is both much less efficient for the machine and the programmer and also not much more useful for the programmer.
PL/I could be tweaked a little, e.g., have AVL trees built in and do more with the attribute LIKE.
Now, of course, sure, I just take the memory management in Microsoft's Visual Basic .NET. For the code for my Web site, VB seems fine.
C#? Reminds me of C so that I do an upchuck. I hate C. Deeply, profoundly, bitterly hate and despise C. The thing is like digging a grave with a teaspoon -- work like hell and then just fall in and die.
1. .NET should throw an exception on deallocation of an IDisposable by the GC if in debug mode
2. Adding more complexity to C# will make it as unwieldy as C++
3. Not flagging tools, frameworks, .NET API calls, and C# features as deprecated leads to failure
4. Lack of cross-mobile development tools is forcing use of lowest-common-denominator languages (JS/HTML)
5. Mobile apps are developed without regard to any longer-term supportability
Write a novel as complex in plot and prose as can be; without regard for the next developer; and at the end of the day call it victory.
This is why I think new tools built from the ground up for this purpose, like Go, make a lot more sense. Native code is and probably will remain a must for a long time and it probably makes more sense for MS to create C# 7.0 as a new language with only the most basic and popularly used syntax features. Maintaining mostly C# syntax/semantics that matches Go's end goals. Yet still able to utilize .Net libraries if need be even if there's a performance hit (or compilation hit) for doing so.
Essentially take C# back to a small, stable core of a language that's purpose built for today's world, but retain the benefits of the wider .Net ecosystem.
I've even seen javascript programs which have manual memory allocators built in.
Most of those who haven't used one of the old languages have no idea of memory management at all.
Manual management is still needed sometimes, but the new languages don't let you mix it with GC. I suspect it was an "ideological" point to use only GC in the new languages in the heat of the battle to spread them.
Now performance issues are probably less common, but more difficult to solve.
Memory overcommit is a poison.
- Linux is configured by default to lie about whether allocations succeed (see the vm.overcommit_memory and related variables - classical allocation can be restored, though!)
- With overcommit allowed, effectively malloc will lie about whether allocations succeed.
- With malloc lying, programs can't determine whether allocs succeeded (yep, past tense) until they *use* the memory, which results in the error of not being able to map an allocated page to memory. This is much harder to handle than the classic NULL returned by malloc, which could be handled locally and with knowledge of the semantic context of the allocation. This usually kills the program.
- Once a system has overcommitted memory, *any* use of memory pages by *any* program - not just the one that overallocated - raises the nasty version of the error. Hence any overcommit puts *all* programs that use malloc at risk. Even programs that handle memory responsibly.
- The hack that attempts to address this problem, the OOMkiller, does what it can to determine which program to kill to free up memory, but it is *impossible* to be rigorously correct in any choice. We see the results repeatedly in production environments where the oomkiller kills some program that was in the middle of doing something important, and usually not the one that was causing the problem. (I'm not blaming oomkiller, it's not the root of the problem)
- As a result, memory recovery has fallen back to just restarting the program that was rashly killed.
The only real protection against oomkiller - other than turning it off through the overcommit settings - has been to configure a large swap space for the system so that there's something monitorable to send an alert (swap usage) before the oomkiller arises from the deep to sink some random ship and crew. However, in the modern glut of RAM, swap is something a lot of users and admins have stopped using. In elder days, the would-have been overcommitted RAM was just allocated on the swap partition, and if honestly never used, was never a problem. This was actually pretty important for large programs (bigger than RAM) that needed to fork to exec something small.
Every single piece of this horrible construction is terrible for user-facing programs, like workstation desktop users, who now have to fear that running anything intensive can cause the oomkiller to gun down just about anything - X, the window manager, the emacs with 40 unsaved files in it, that game where they're the tank for a 40-person RAID, etc.
On top of this, with all of these terrible libraries and programs which have now just stopped caring about memory management because they've deferred all the problem to oomkiller and service restarts, no program, no matter how well-written, can safely share a computer with them, because they are all *innately* irresponsible.
I frequently find such issues and wonder if the developers did ever measure startup performance. The biggest hurdle is to convince developers to admit that they are overusing some framework features and they should be looking for more adequate solutions to their actual problem. If you are really brave try to ask them to stop using that framework.
A small example: Someone hosted Workflow Foundation and wanted to store the WF configuration in a self-defined config file. To do so, they searched the internet and found that by setting up a new AppDomain you can set the app.config file name as you like. Problem solved. The only issue was that now every assembly was being JIT compiled again although there were NGen images around. That delayed startup by 10s. After deleting 5 lines of code to NOT create a new AppDomain and simply use the current executable's app.config file, the JIT costs went down from 13s to 0.7s and the startup time decreased by 10s.
It is interesting that setting the LoaderOptimization attribute on the new AppDomain to MultiDomainHost did not help. As far as I remember from older .NET (< 4.6) versions, this enforced the usage of the NGenned dlls. That seems to no longer be the case.
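A hedged sketch of the two approaches described above (file and method names are illustrative, not the commenter's actual code): creating a second AppDomain just to point at a different config file means assemblies get JIT-compiled again for that domain, whereas reading the custom file through System.Configuration keeps everything in the default domain and its NGen images.

using System;
using System.Configuration;

class ConfigWithoutNewAppDomain
{
    // The expensive pattern: a whole new AppDomain only to change the config path.
    static void RunInSecondDomain()
    {
        var setup = new AppDomainSetup { ConfigurationFile = "workflow.config" };
        AppDomain domain = AppDomain.CreateDomain("WorkflowHost", null, setup);
        // ... code executed here is JIT-compiled again for this domain ...
        AppDomain.Unload(domain);
    }

    // The cheap alternative: open the custom file directly in the current domain.
    static string ReadSettingFromCustomFile(string key)
    {
        var map = new ExeConfigurationFileMap { ExeConfigFilename = "workflow.config" };
        Configuration config =
            ConfigurationManager.OpenMappedExeConfiguration(map, ConfigurationUserLevel.None);
        return config.AppSettings.Settings[key]?.Value;
    }
}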
Those who adopted various self-hosting strategies over the years (Nancy, OWIN, WCF self-host, WebApi, etc.) soon found out they had to think about these things as their process's LOH became all fragmented.
I'm happy there is focus on this.
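For context: allocations of roughly 85 KB or more go straight to the Large Object Heap, which is not compacted by default and so fragments over time. Since .NET 4.5.1 you can opt in to a one-time compaction; a minimal sketch:

using System;
using System.Runtime;

class LohCompactionSketch
{
    static void CompactLargeObjectHeap()
    {
        // Arrays around 85,000 bytes and up land on the LOH.
        byte[] big = new byte[200000];
        GC.KeepAlive(big);

        // Ask for a one-time LOH compaction on the next full, blocking collection.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}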
Kestrel sounds promising. I must admit I am appalled when I look at how deep the stack on an MVC web app is by the time it reaches a controller action and how much memory a .NET application can rack up in general.
While I liked the WPF architecture I gave up on it because of its extravagant use of memory and overall poor rendering performance - text in particular.
I actually look forward to a day when .NET can produce small tight native assemblies with high performance and minimal dependencies (read that as low likelihood of problems when installed on a wide variety of machines). I guess when that happens MS will be finally able to port Office to C#.
It's way more productive to work in smaller teams doing sprints and pushing out new content.
Not to mention that compiler optimization makes even scripting languages like JavaScript run pretty fast server side.
However, in later years, languages like C++ have become more popular again due to their speed and flexibility. Modern C++ differs from the olden days where you used new and delete yourself to allocate memory on the heap, but that does not mean the developers don't know what's happening in the background.
We are basically seeing an optimization of process.
This goes back to the awesome podcast on abstraction.
http://www.hanselman.com/blog/ThisDevelopersLife106Abstraction.aspx
Coders think 'I could tweak that' while management think 'cost of coder doing that vs cost of them doing something else like adding in a new feature that may get us a sale to corporate xyz in that demo we have lined up in 3 months'.
It gets worse. That mindset teaches our latest crop of young, naive software engineers that the old ways are bad and these new ways are better. And, if all you're doing is CRUD, why not? But what we have is a generation of software engineers that couldn't possibly write its own tools -- how are you going to create a memory manager and garbage collector without pointers? I was always afraid that's what Visual Basic was going to get us, but I was wrong -- it's Java.
The irony of Java being primarily run on Linux just tickles me to no end. Linux, where every choice is made to improve performance, and Java, where performance takes a back seat to every other possible consideration. I suppose Java needs Linux for precisely that reason.
Anyway, this seems to be a common issue for people who went the Assembly-to-C route as opposed to starting from a 4GL and never working backward because you already incorporated the lie that lower levels are just too hard into your (now terribly limited) thinking processes. We grew up knowing what the system can do, and how to do it, and then got higher level languages built from a philosophy of "trust the programmer". Switching to a diametrically opposed viewpoint of "the programmer cannot be trusted" is possible but offensive.
Sorry, that doesn't read well to me. Here's my attempt at what I think you were trying to say:
"There's some amazing stuff going on in the code base with various micro-optimizations in memory management that works more intelligently."
Either that, or you used the wrong word and should have said "manages" instead of "management". Also, in any version it seems redundant to say micro-optimizations and more intelligently in the same sentence. Does anyone design micro-optimizations to work more stupidly?
I think what is happening is that early micro coders had to work with a "huge" memory size of 64 KB when microcomputers started showing up in the late 70's; word size was 8 bits and it was a total joke to us mainframe coders. Today, my laptop has a 64-bit word size and its memory size is 3 GB, and that's because I'm a cheapskate. I still try to figure out how much memory is too much by testing until my code blows.
There still isn't a lot of thought spent on memory and its benefits in good coding. Then again, a lot of people don't spend the time they should either because they don't have the experience to know there are other solutions and "maybe" they are just too lazy to look for another way.
All too often, it's "Find a solution now." and thinking about design never happens.
1) When memory creates performance issues
2) When it costs you money (Azure and cloud services)
When else would you care?
Well, the languages that required using pointers correctly also required building at such a low level that re-inventing the wheel with each new version is necessary; there aren't any simple high-level design ideas that allow the developer to work on the idea and let the code work on the minutiae. C# gained on C++ because in C# a new version of an app can be written in a week and reviewed in a couple of days. When it is tested, there aren't memory leaks, it generally does what it was supposed to, and it might run a few microseconds slower, but more importantly, it works and it is easier to test whether it does what it is supposed to. Nobody reads every other day about a new exploit against C# that invades or destroys code that had earlier passed tests. There are supposedly simple concepts like "string" that can manipulate characters way better than C++ can even think about without huge blocks of code.
“Memory leak” and “C#” almost never come in the same sentence. “Lost” pointers and “Memory leak” are usual companions.
If C++ had such a huge advantage over C#, the latter never would have caught on.
The same goes for deallocation - one might no longer need to free memory manually, but object lifetime/validity is still an important factor in an algorithm.
Thus, the most frequently mentioned C disadvantages aren't solved by less manual memory management:
- Array indexing failures? They will still happen, happily throwing an exception and messing up your intended behaviour. Or even worse, they index a wrong yet still valid element, silently yielding wrong results.
- Null pointer access? Its counterpart is accessing a null reference, with basically the same result (well, at least you usually get a nicer stack trace...).
- Dangling pointers? They would be happily served by an object still surviving beyond its intended usage span. But that doesn't answer the main question - why is this object accessed that late at all? Chances are high that it contains bogus or outdated data, or is a hidden memory hog.
In short, automatic memory management may save some lines of code and hide some simple errors, but you still have to think and keep track of what's going on in almost exactly the same way. If automatic memory management saves a lot of time here, it is more an indication of missing abstraction and structure in the way you want to solve a certain task than a disadvantage of certain tools and languages.
On the other hand, most environments with automatic memory cleanup still suffer from the most dreaded idea in language design:
making references nullable and mutable by default, or not even providing non-nullable/non-mutable types at all. There's nothing wrong with null or side effects, but allowing them everywhere instead of only where explicitly stated is a far more severe source of bugs and waste of development time than memory management.
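For what it's worth, later C# versions (8.0 and up) added opt-in non-nullable reference types, which address exactly this complaint; a minimal sketch with nullable annotations enabled:

#nullable enable

class NullabilitySketch
{
    // 'string' is treated as non-nullable; 'string?' must be stated explicitly.
    static int Describe(string definitelyAName, string? maybeAName)
    {
        int a = definitelyAName.Length;  // fine: cannot be null by contract
        int b = maybeAName?.Length ?? 0; // the compiler pushes you to handle null here
        return a + b;
    }
}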
Interestingly enough, I have been talking with climate scientists about how the methane release from the Arctic has doomed mankind; obviously most coders have no idea about this due to abstraction - the same principle as with memory management.
I wonder if in the future coders will get automated emails from Azure saying "your app may have memory management issues, we advise you run our special diagnostic program on your code..." ....
I need to suggest a global sensor network to climate scientists so we can run a diagnostic on earth.
Most samples on the internet ignore that. People copy and paste code.
That's the main problem for us.
When will the web server be available for production?
I don't worry about memory management, until I do. Usually when things are running within a single "machine" and things have to scale. The answer for me is Azure, Azure, Azure; scale horizontally to another container or VM if memory becomes a problem. Of course, then IO can become a problem...
To keep myself grounded, however, I do some spare-time development on classic game systems, where every byte is needed (machine I am working on now has 147 bytes of RAM).
It's taken many years for me to come to the conclusion that there is a trade off between fast and clean. I can have fast code, but it generally comes with a giant pile of brittle tests that are there for documentation. I can have clean, obvious, self-documenting code and I tend to get a small number of robust tests. I find that for the majority of the code I write these days there is a millisecond or less difference between fast and clean. However every project seems to have those one or two spots where crunching numbers makes the difference and I need to optimize for speed.
I can see that many devs these days don't care about memory management or writing efficient algorithms, but at the same time they legitimately can get away with it. The CLR and garbage collection is pretty smart about things. To some that's a crutch to others it's a tool, but in both cases business is getting done and clients are happy with the results.
If I optimize an algorithm to run 10ms faster, but it takes me an hour to do it, I've got to be sure that the algorithm is going to be used 360,000 times before it's made my own time back. Or that it'll make my client back what I'm billing them for an hour of work. I'd love to optimize all my stuff, but sadly I can't do it that often without feeling like I'm wasting my clients' money.
--
Sergiy Zinovyev
Some people say hardware is cheap, I can tell you it is not cheap when you need a large cluster of servers in a 2N+1 datacenter. The costs/savings between 120 servers and 12 servers is significant. Every 1ms shaved off a frequently used routine is £££ saved.
I never stopped caring about memory management in 20 years.
I guess the bottom line is to know *all* your tools: from the hardware to the language you're using, they all affect how you should write your code and it will perform.
In kernel space programming, memory footprint is a big deal. A byte taken will not be returned forcefully by the OS nor will the kernel terminate itself if it oversteps its memory bounds. What will happen is a crash.
Therefore, while user space programmers who work with GC can afford the luxury (and I say this with a grain of envy) of knowing that someone out there is managing their memory for them, the kernel space people still MUST know about memory management.
Huge leaks and a high memory footprint, made worse by entangled code and abusive use of mappings and bindings.
And then you have management counting the Jira bugs and wondering why this one is still there week after week with pseudo-scrum one-week delivery (which actually was the first cause of the memory mismanagement: rushed deadlines).
And good luck if you think the Chrome tools will help you easily find the culprit(s) in your code. And then you have all those cool Bower libs that were so easy to add. One of those libraries could be responsible for a major memory retention, but you're stuck with the old version because the framework developers have already moved on to the shiny new ES6 thing...
Hint: If your application needs 1 gig of RAM on the desktop, your app is not mobile ready.
So yeah, memory management is still relevant. Premature optimisation is evil; not planning and managing a memory budget/target is too.
I cringe every time someone talks about refactoring.
Memory management "problems" cannot be debugged - they must be designed-out ab initio.
Why? Because they reflect *architectural* failure, not implementation failure. The fundamental structural integrity is insufficient to support the system and successfully guide the construction, tools, and methods to a successful, robust implementation. Quality can only be built-in; it cannot be added-on.
Nothing makes a prediction of disastrous problems faster than seeing malloc() and free() being called willy-nilly throughout code.
And note carefully that I am not talking about premature optimization. Doing a comprehensive soil study so the correct foundation can be selected and designed for a brand new building is not premature optimization. It is the difference between a chance at success and certain failure. Failure to do systems design commensurate with the requirements of a software system is likewise certain failure.
#Lawn? You Can't Handle A Lawn
(grin)
With this information known, memory management becomes a largely formulaic exercise. Without it, you are still at risk of making semantic errors even if your language does memory management for you (for example, a mistake may result in having multiple objects, some of them in an outdated state, being used as if they all represented the same entity.)
I am not arguing that automatic memory management is a bad thing; I prefer to use it when I can. Even when manual memory management is formulaic, mistakes are easy to make. My point is that memory-management problems may be symptoms of a deeper failure of understanding, and in these cases, automatic memory management will not fix the underlying problem.
Anyway, let's hope for the best...
In the end, it depends on the application, and for the most part you don't have to think about it until there are other problems... Following functional philosophies as much as possible, even in OO applications can save you a lot of headaches down the road. I tend to prefer simpler objects, and static utility classes that work with them in say C#/Java. This pattern tends to work out better in terms of application flow, growth, and memory management over smarter objects/classes.
I agree with Kamran. Writing webjobs in Azure on smaller boxes has made me refocus on memory. I love the SDK, but the cost is overhead for a continuous job.
1) Be correct
2) Be legible
3) Be performant
4) Be clever
Proper design should *allow* optimization when necessary. I don't typically worry about memory management because I don't typically have to. I'm more likely to run into a problem of maintainability if code was developed with an optimize-first mindset than to run into a problem of optimization if code was developed with a maintenance-first mindset.
The more you run into these types of problems, the more you are able to design your code to both be maintainable *and* avoid memory issues at the start, but if you're going to pick one, make it maintainable.
http://www.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf
The oversimplified takeaway is that garbage collection doesn't inherently cost time, but it does cost something like 2x memory overhead to achieve decent performance.
So the real questions developers need to ask themselves are: How memory hungry is my application? Can I move the data that's chewing up lots of memory into some kind of database that will store it more efficiently?
Don't concat strings if you don't have to.
As far as "how low should you go". much managed software that I have seen in big
companies leak ram badly, and here is the catch:
the more physical ram you have on a server, the tougher it may be because
GC will not kick-in fast enough (as there is tons of space free), when it does -
you are in for a 10 sec stall even with all BG server GC enabled.
We decided to go pretty low in managed CLR, ended -up doing "Big Memory", so now
we can easily keep 100,000,000 records using 32-64 Gb heaps easily forever
and GC is not suffering. Before that we tried to use "outofbox" stuff
but it is still slower than this:
http://www.infoq.com/articles/Big-Memory-Part-2