Missing XML comment for publicly visible type or member ‘considered harmful’

One of the nice things about .net is that the compiler can automatically generate an .xml file from the XML documentation comments in your code.

One of the worst things, however, is that by default this leads to compiler warnings (and, if “treat warnings as errors” is enabled – as it should be – to a failed compilation).
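For context, this situation needs two things to be enabled in the project: XML documentation output and warnings-as-errors. In an SDK-style project file, that looks roughly like this (both are standard MSBuild properties):

<PropertyGroup>
  <!-- Write the xmldoc comments to an .xml file next to the assembly -->
  <GenerateDocumentationFile>true</GenerateDocumentationFile>
  <!-- Turn warnings such as CS1591 into build failures -->
  <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
</PropertyGroup>

With that in place, an undocumented public class produces output like this: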

1>FooWrapper.cs(5,18,5,28): warning CS1591: Missing XML comment for publicly visible type or member 'FooWrapper'
1>FooWrapper.cs(7,21,7,24): warning CS1591: Missing XML comment for publicly visible type or member 'FooWrapper.Foo'
1>FooWrapper.cs(9,16,9,26): warning CS1591: Missing XML comment for publicly visible type or member 'FooWrapper.FooWrapper()'
1>FooWrapper.cs(14,16,14,26): warning CS1591: Missing XML comment for publicly visible type or member 'FooWrapper.FooWrapper(bool)'
1>FooWrapper.cs(19,32,19,39): warning CS1591: Missing XML comment for publicly visible type or member 'FooWrapper.Dispose(bool)'
1>FooWrapper.cs(23,21,23,28): warning CS1591: Missing XML comment for publicly visible type or member 'FooWrapper.Dispose()'

This often leads to the desire to add comments to everything, possibly even using automated tools, which results in a class like this:

/// <summary>
/// A Class to wrap a Foo value.
/// </summary>
public class FooWrapper: IDisposable
{
    /// <summary>
    /// The wrapped Foo value
    /// </summary>
    public bool Foo { get; }

    /// <summary>
    /// Initializes a new instance of the <see cref="FooWrapper"/> class.
    /// </summary>
    public FooWrapper()
    {
    }

    /// <summary>
    /// Initializes a new instance of the <see cref="FooWrapper"/> class,
    /// with the given value for foo.
    /// </summary>
    public FooWrapper(bool foo)
    {
        Foo = foo;
    }

    /// <summary>
    /// Releases unmanaged and - optionally - managed resources.
    /// </summary>
    /// <param name="disposing">
    ///     <c>true</c> to release both managed and unmanaged resources;
    ///     <c>false</c> to release only unmanaged resources.
    /// </param>
    protected virtual void Dispose(bool disposing)
    {
    }

    /// <summary>
    /// Performs application-defined tasks associated with freeing,
    /// releasing, or resetting unmanaged resources.
    /// </summary>
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }
}

What’s wrong with this class? The signal-to-noise ratio is atrocious – I consider this downright harmful to understanding what the class does – and of course, the more comments there are, the quicker they get outdated. Let’s break it down into the useful and the useless:

FooWrapper: A Class to wrap a Foo value.

Potentially useful. This tells me what the class is meant for, but sane naming of the class already does that. It could be more useful to explain why Foo needs to be wrapped and when I should use this instead of just passing around the Foo value directly, and when to subclass it.

Foo: The wrapped Foo value

Useless. I know it’s a wrapped Foo value because it’s a property named Foo in a class named FooWrapper. What could make this useful is explaining what this Foo value represents, and what I would use it for.

FooWrapper: Initializes a new instance of the <see cref="FooWrapper"/> class.

Useless. I know that it initializes a new instance of the FooWrapper class, because it’s a constructor of the FooWrapper class. That’s what constructors do: they initialize new instances of the class they are part of. There is no other information conveyed here – no information about potential side effects, about valid input arguments, about potential Exceptions, nothing.

The overload that tells me that the bool foo argument will initialize Foo to the given foo is also useless, because – well, duh, what else is it going to do?

Dispose: Releases resources

Useless. IDisposable is a fundamental language feature, so both the reason for this method and the Dispose pattern are well known. What isn’t known is if there’s anything noteworthy – does it dispose any values that were passed into the constructor? (Important e.g., when passing Streams around – whose job is it to close/dispose the stream in the end?). Are there negative side effects if NOT disposing in time?

Useful comments

Now, this class is arguably a very simplistic example. But that also makes it a very good example, because many applications and libraries contain tons of these simple classes. And many times, it feels like they are commented like this out of Malicious Compliance, in order to shut the compiler warnings up or to fulfill some “All Code must be documented” rule.

The real solution is to suppress the CS1591 warning and only add comments to code that does something non-obvious or critical to pay attention to. In the case of the example class above, the best I can come up with is below.

/// <summary>
/// This class wraps a Foo value, captured
/// from when the operation was started.
///
/// Operations that need to capture additional values
/// should derive from this to add their own additional
/// values.
/// </summary>
public class FooWrapper : IDisposable
{
    /// <summary>
    /// The Foo that was wrapped at the beginning of the operation.
    /// Changes to the Foo value in the holder class do not change this value.
    /// </summary>
    public bool Foo { get; }

    public FooWrapper()
    {
    }

    public FooWrapper(bool foo)
    {
        Foo = foo;
    }

    /// <summary>
    /// This class implements IDisposable to allow
    /// derived classes to capture values that need to be
    /// disposed when the operation is finished.
    /// </summary>
    protected virtual void Dispose(bool disposing)
    {
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }
}

Now, the comments convey useful information: We learn the intent of the class – that’s something not obvious from the code. Though arguably, this class should now be called InitialOperationState or something like that. It also explains why/when to create subclasses for it. The comment on the property now explains something about the purpose, rather than just reiterating the code in prose. And finally, the Dispose(bool) method explains why it’s there. The constructors and Dispose() methods do not need any comments – they don’t do anything worth commenting.

And because I suppressed 1591, the compiler is happy as well.
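For a project-wide suppression, a NoWarn entry in the project file does the trick (1591 is the relevant number; appending to $(NoWarn) keeps any suppressions you already have):

<PropertyGroup>
  <!-- Keep generating the xmldoc file, but don't warn on undocumented members -->
  <NoWarn>$(NoWarn);1591</NoWarn>
</PropertyGroup>

Alternatively, #pragma warning disable 1591 works for individual files.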

Accessing LDAP Directory Services in .NET Core

The .NET Framework has had support for LDAP through the System.DirectoryServices Namespaces since forever. This has been a P/Invoke into wldap32.dll, which limited the ability for developers to troubleshoot issues and wasn’t platform-independent. With the advent of .NET Core and the desire to run applications on Linux or macOS, the lack of LDAP Support has been an issue.

In the Java world, it’s normal to have fully managed libraries in lieu of platform-limited wrappers, and LDAP is no exception. These days, the Apache Directory LDAP API™ looks like the go-to, but way back in the day, Novell also had an LDAP Client. This was eventually donated to the OpenLDAP project, where it lives in the JLDAP tree, although development has long since stopped. Novell also used to own Mono, and during that time they made a C# conversion of their LDAP Client. The code was clearly run through an automated Java-to-C# converter, but it offered a fully managed way to access LDAP.

While that C# code had lain dormant since the initial release in 2006, .NET Core offered a new incentive to revisit it. dsbenghe made a conversion of the code to support .NET Standard 1.3/2.0, which lives at https://github.com/dsbenghe/Novell.Directory.Ldap.NETStandard and is available on Nuget as Novell.Directory.Ldap.NETStandard.

Over the past couple of weeks, I’ve made some contributions as well, mainly to add support for SASL Authentication, which is available since Version 3.0.0-beta4. At this time, only the CRAM-MD5, DIGEST-MD5 and PLAIN mechanisms are available, but this offers the foundation to connect to a wider range of directories in case Simple LDAP Bind isn’t an option.

An example of how to connect to an LDAP Directory (in this case, Active Directory) using DIGEST-MD5:

var ADHost = "mydc.example.com";
var saslRequest = new SaslDigestMd5Request("Username", "Password", "Domain", ADHost);

using (var conn = new LdapConnection())
{
    try
    {
        conn.Connect(ADHost, 389);
        conn.StartTls();
        conn.Bind(saslRequest);
        Console.WriteLine($"[{conn.AuthenticationMethod}] {conn.AuthenticationDn}");
    }
    finally
    {
        if (conn.Tls)
        {
            conn.StopTls();
        }
    }
}

Now, whether this is preferable to simple bind is up for discussion – the fact that DIGEST-MD5 requires the domain controller to store the password with reversible encryption is certainly a potential issue. But on the other hand, if you cannot guarantee the security of the transport, DIGEST-MD5 at least means your password never has to be sent over the wire.

Ultimately, support for the SASL EXTERNAL mechanism with Client Certificates and support for Kerberos will offer modern security/authentication mechanisms. But the bottom line is that there is now a 100% managed LDAP Client for .net that’s in active development. One that is supposed to support any LDAP Server instead of focusing mainly on Active Directory, but one that will offer first class Active Directory support as well. For Stack Overflow Enterprise, we made first class LDAP Authentication support a big goal for the future. We want to support as many real-world environments as possible, and we want everything to work on .NET Core as well. There’s still plenty of work to do, but I’m happy that this project exists.

PicSol – a .net Nonogram/Picross Solver Library

Nonograms – also known as Griddlers, Picture Crosswords, or Picross – are pretty cool puzzles, kind of like a more visual Crossword puzzle or Sudoku. Of all the games on my New 2DS XL, Mario’s Picross and the Picross e series are near the top of my Activity Log (beaten only by Smash Bros).

I got curious about algorithmic solutions to those Nonograms, something that seems deceptively easy but is actually NP-complete. When trying to solve a Nonogram, often I can only fill in one or a few cells of a group, which then leads to another cell that can be filled in (or X-ed out), and step by step, cell by cell, I solve the Nonogram. Now, that assumes the Nonogram is properly designed – if it is, there is always at least one cell that either must definitely be filled or must definitely be empty.

All of Jupiter’s games are well designed – even the most tricky ones (with a bunch of 1’s and 2’s and no big numbers) always follow the mantra of There’s always at least one cell that has a definitive solution. There are a lot of other games on the market (Steam returns about 15 games when searching for Picross or Nonogram), and some are not well designed and actually require guessing.

I ended up (after a bunch of googling for approaches and other existing solvers) with a solution that’s mostly brute force: generate all possibilities for each row and column, then eliminate those that can’t be correct, rinse and repeat until there’s only one possibility left for each row and column, or until we determine that the Nonogram is actually unsolvable. There are some shortcuts we can take, e.g., when a row/column is empty, completely filled, or completely filled with gaps in-between.
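To illustrate the core idea, here is a simplified sketch for this post – not PicSol’s actual code; the names Arrangements and Deduce are made up:

using System;
using System.Collections.Generic;
using System.Linq;

public static class LineSolver
{
    // Generates every way the runs described by the clues fit into a line
    // of the given length. true = filled, false = empty.
    public static IEnumerable<bool[]> Arrangements(int[] clues, int length)
    {
        if (clues.Length == 0)
        {
            yield return new bool[length]; // no runs left: the rest stays empty
            yield break;
        }

        int run = clues[0];
        int[] rest = clues.Skip(1).ToArray();
        int restMin = rest.Sum() + rest.Length; // space the later runs need, incl. one gap each

        for (int start = 0; start + run + restMin <= length; start++)
        {
            int gap = rest.Length > 0 ? 1 : 0; // mandatory gap after this run
            foreach (var tail in Arrangements(rest, length - start - run - gap))
            {
                var line = new bool[length];
                for (int i = 0; i < run; i++)
                {
                    line[start + i] = true;
                }
                Array.Copy(tail, 0, line, start + run + gap, tail.Length);
                yield return line;
            }
        }
    }

    // Keeps only the arrangements compatible with the already-known cells
    // (null = unknown) and fixes every cell that is identical in all of them.
    public static bool?[] Deduce(int[] clues, bool?[] known)
    {
        var candidates = Arrangements(clues, known.Length)
            .Where(a => Enumerable.Range(0, known.Length)
                .All(i => !known[i].HasValue || known[i].Value == a[i]))
            .ToList();

        if (candidates.Count == 0)
        {
            throw new InvalidOperationException("No arrangement fits – the Nonogram is unsolvable.");
        }

        var result = (bool?[])known.Clone();
        for (int i = 0; i < known.Length; i++)
        {
            if (candidates.All(a => a[i])) result[i] = true;
            else if (candidates.All(a => !a[i])) result[i] = false;
        }
        return result;
    }
}

Running Deduce over every row and column until no cell changes anymore is the “rinse and repeat” part.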

I’ve created PicSol, a library for .net Standard 2.0 and .net Framework 4.0 (or newer), available on Nuget, which offers a Solver for Nonograms.

Check out the README for information on how to use it, or look at the Console project in the GitHub repository.

Using .net Framework sources for better debugging

Over the last couple of weeks, we’ve been working on changes to our SAML 2.0 Authentication on Stack Overflow Enterprise. This was required to better support scenarios around signing and encrypting SAML Requests and Responses, and as such, a lot of the work was centered around XML Security, specifically the SignedXml and EncryptedXml classes.

Now, one thing I can say for sure about SAML 2.0 is that every Identity Provider implements it slightly differently, causing XML Validation errors or CryptographicExceptions somewhere deep in the code. So, how can we properly debug and fix this?

The first option is to Enable Symbol Server support in Visual Studio. This gives you .pdbs for .net Framework code, but because the Framework is compiled in Release mode, some code is inlined or otherwise rewritten to no longer match the exact source, which makes following variables and even call stacks really hard the deeper you go.

Another option is to check out the .net Framework Reference Source, also available at the easy to remember http://sourceof.net. This is the actual source code of the .net Framework, which allows you to at least read through it and see what’s actually going on – until you hit native/external code, anyway. You can also download the sources, which not only lets you view them in Visual Studio, but also lets you compare implementations across Framework versions to see if anything changed. (The code on the website is always only for the latest Framework version, which is 4.7.1 at the time of writing. If I need to see how something was implemented in 4.6.2, I have to download the sources.)

Another thing we can do with the reference sources is put them into our project, change the namespaces to avoid ambiguity, and then use our private copies of the implementation.
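For example (with a hypothetical namespace – the point is just that the vendored copy can’t be confused with the real one):

// Vendored copy of the reference source, moved into our own namespace
// so it doesn't clash with System.Security.Cryptography.Xml.SignedXml.
namespace MyCompany.ReferenceSource.Xml
{
    public class SignedXml
    {
        // copied (and now freely debuggable) reference source goes here
    }
}

// A call site then opts into the private copy explicitly:
// using SignedXml = MyCompany.ReferenceSource.Xml.SignedXml;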

This is a huge help for several reasons:

  • We can step through actual code – no downloaded symbols or weird abstractions, it’s the actual source.
  • Since we can compile in Debug mode, we don’t have to worry about optimizations making the debug experience much harder.
  • Breakpoints work properly, including conditional ones.
  • If we have a hypothesis about a cause of or fix for an issue, we can make source code changes to verify.

Now, there are two problems with Framework Sources. First off, because Framework code can refer to protected and internal methods, we might have to either copy a massive amount of supporting code, or implement workarounds to call those methods that are inaccessible to us (there are 79 files in my private copy of SignedXml/EncryptedXml). But the real showstopper is that the license doesn’t allow us to ship Framework code, as it’s only licensed for reference use. So if I found a way to fix an issue, I need to see how I can make this work on our side of the code, using the actual .net Framework classes.

Now, if we really can’t find a way to solve an issue without modifying Framework code, a possible option is the .NET Core Libraries (CoreFX) sources, because that code is MIT-licensed. It’s a subset of the .net Framework, but due to the license, anything that’s there can be used, modified, and shipped by us. This is a bit of a last resort, but can be preferable to worse workarounds. I cannot overstate how awesome it is that Microsoft releases so much code under such a permissive license. It not only makes our life so much easier, but in turn it benefits our customers by providing a better product.

In the end, we could resolve all the issues we’ve seen without having to modify Framework code after we understood exactly where (and why!) something was failing, and our SAML 2.0 implementation got a whole lot better because of the availability of the source code.

Simplexcel 2.0.5

It’s been a few months since I released Simplexcel 2.0.0, which was a major change in that it added .net Standard support and can now be used on .net Core, incl. ASP.net Core.

Since then, there have been a few further feature updates:

  • Add Worksheet.Populate<T> method to fill a sheet with data (see the sketch after this list). Caveats: does not look at inherited members, doesn’t look at complex types.
  • Also add static Worksheet.FromData<T> method to create and populate the sheet in one step.
  • Support for freezing panes. Right now, this is being kept simple: call either Worksheet.FreezeTopRow or Worksheet.FreezeLeftColumn to freeze either the first row (1) or the leftmost column (A).
  • If a Stream is not seekable (e.g., HttpContext.Response.OutputStream), Simplexcel automatically creates a temporary MemoryStream as an intermediate.
  • Add Cell.FromObject to make Cell creation easier by guessing the correct type
  • Support DateTime cells
  • Add support for manual page breaks. Call Worksheet.InsertManualPageBreakAfterRow or Worksheet.InsertManualPageBreakAfterColumn with either the zero-based index of the row/column after which to create the break, or with a cell address (e.g., B5) to create the break below or to the left of that cell.
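Roughly, the new APIs combine like this – a sketch only, where I’m assuming FromData takes a sheet name plus the data, so check the README for the exact signatures:

using System;
using Simplexcel;

public class Person
{
    public string Name { get; set; }
    public DateTime Born { get; set; } // DateTime cells are supported now
}

var people = new[]
{
    new Person { Name = "Ada Lovelace", Born = new DateTime(1815, 12, 10) }
};

// Create and populate the sheet in one step, then freeze the header row.
var sheet = Worksheet.FromData("People", people);
sheet.FreezeTopRow();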

Simplexcel is available on Nuget, and the source is on GitHub.

TresorLib – a deterministic password generator library for .net

Tresor is a .net library (.net 4/netstandard 1.3 or newer) to generate passwords in a deterministic way – the same inputs will always yield the same outputs – so from simple inputs like the service name twitter and the passphrase I'm the best 17-year old ever. you will get a strong password like c<q_!^~,'.KTbPV=9^mU.

This helps with the bad habit of reusing the same passphrase for multiple sites, but it also doesn’t rely on a password vault that you don’t always have with you (e.g., because it’s a file on your computer and you’re on the road) or require you to trust a cloud service to securely keep your data.

Internet Websites get compromised all the time – so frequently that haveibeenpwned.com has been created for people to get notified whenever their email address shows up in a data breach. Troy Hunt has an amazing blog about web security issues and breaches, and explains why you should use a Password Manager.

Tresor is a port of Vault by James Coglan to .net. When I first discovered Vault, it immediately checked the boxes I was looking for: It works everywhere as long as I have internet access, but it doesn’t store any data that can be compromised. All I need to do is remember the service name and settings, and voila, there’s my password without having to store it in yet another online database that can be compromised.

For more information about Vault, check the FAQ on https://getvau.lt/faq.html.

Please note that this port is not endorsed or in any way associated with James Coglan.

TresorLib has been published to Nuget, and the source code is available on GitHub. It is licensed under the GNU General Public License Version 3 (GPLv3).

Example usage:

var service = "twitter";
var phrase = "I'm the best 17-year old ever.";
var password = Tresor.GeneratePassword(service, phrase, TresorConfig.Default);

// password => c;q- q}+&,KTbPVn9]mh

For more usage information and documentation, check the Readme in the GitHub repository.

Simplexcel 2.0.0

A couple of years ago, I created a simple .net Library to create Excel .xlsx sheets without the need to do any COM Interop or similar nonsense, so that it can be used on a web server.

I just pushed a major new version, Simplexcel 2.0.0 to NuGet. This is now targeting both .net Framework 4.5+ and .NET Standard 1.3+, which means it can now also be used in Cross-Platform applications, or ASP.net Core.

There are a few breaking changes, most notably the new Simplexcel.Color struct that is used instead of System.Drawing.Color, and the change of CompressionLevel from an enum to a bool, but in general this should be a very simple upgrade. If you still need to target .net Framework 4 instead of 4.5+, stay on Version 1.0.5.

RAM and CPU Cycles are there to be used.

What is wrong with this picture?

Answer: The white area at the top of the CPU and Memory graphs. It indicates that I spent money on something that I’m not using.

One of the complaints that is often brought forward is that certain applications (especially browsers) are “memory hogs”. As I’m writing this, Chrome uses 238.1 MB of RAM, and a separate Opera uses 129.8 MB. Oh my, remember when 4 MB was enough to run an entire operating system?

Now, here’s the thing about RAM and CPU cycles: I spend my (or someone else’s) hard-earned cash on them in order to speed up my computer use. That’s literally why they exist – to make stuff go faster, always.

Having 32 GB of RAM costs about $200. In the above screenshot, about $135 of those hard-earned dollars are just doing nothing. It’s like hiring (and paying) a full-time employee and only giving them 3 hours of work every day. That CPU? It’s quad-core, eight-thread, with 2.6 billion cycles per second – that’s between 10 and 20 billion total cycles each second. And yet, it’s sitting at 16% in that screenshot. For a CPU that’s priced at $378, that’s $317 not being used.

There are priorities and trade-offs that everyone needs to make. In a Laptop, maybe you don’t want the CPU to be constantly close to 100%, because it drains the battery and the whirring cooling fan is annoying. But maybe you bought such a high end laptop specifically because you want to use that much CPU power and RAM. I certainly did.

My primary application is Visual Studio, which has made some really good strides in 2017 to be faster. Find All References is pretty fast, Go To Types is pretty fast. Only “Find in Files” could be faster, because it still seems to hit the disk. The cost for that? Currently 520 MB of RAM usage. I’ll take it. In fact, if I could get more speed at the expense of more RAM, I’d take that as well. I would love for Visual Studio to find a way to reduce the 45-second build time – as you can see in the graph, the CPU only briefly spikes. Why is it not at a constant 100% when I click the Build button? Is there a way to just keep everything needed to compile in RAM at all times? (And yes, I know about RAM disks, and I do have an SSD that does 3 GB/s – but the point is for applications to be more greedy.)

Two applications I run that really make great use of my computer are SQL Server and Hyper-V. SQL Server is currently sitting at 3.5 GB and will grow to whatever it needs. And Hyper-V will use whatever it needs as well. Both applications also respect my limits if I set them.

But they don’t prematurely limit themselves. Some people are complaining about Spotify’s memory usage. Is that too much for a media player? Depends. I’m not using Spotify, but I use iTunes. Sometimes I just want to play a specific song or album, or just browse an artist to find something I’m in the mood for. Have you ever had an application where you scroll a long list and halfway through it lags because it has to load more data? Or where you search/filter and it takes a bit to display the results? I find that infuriating. My music library is only ~16,000 tracks – can I please trade some RAM to make the times that I do interact with it as quick as possible? YMMV, but for me, spending 500 MB on a background app for it to be super-responsive every time I interact with it would be a great tradeoff. Maybe for you it’s different, but for me, iTunes does stay fast at the expense of my computer’s resources.

Yeah, some apps may take that too far, or do harmful things like trashing your SSD. Some apps use too much RAM because they’re coded inefficiently, or because there is an actual bug. It should always be a goal to reduce resource usage as much as possible.

But that should just be one of the goals. Another goal should be to maximize performance and productivity. And when your application sees a machine with 8, 16 or even 32 GB RAM, it should maybe ask itself if it should just use some of that for productivity reasons. I’d certainly be willing to trade some of that white space in my task manager for productivity. And when I do need it for Hyper-V or SQL Server, then other apps can start treating RAM like it’s some sort of scarce resource. Or when I want to be in battery-saver mode, prioritizing 8 hours of slow work over 4 hours of fast work.

But right now, giving a couple hundred megabytes to my web browsers and productivity apps is a great investment.

git rebase on pull, and squash on merge

Here are a few git settings that I prefer to set, now that I actually work on projects with other people and care about a useful history.

git config --global branch.master.mergeoptions "--squash"
This always squashes merges into master. I think that work big enough to require multiple commits should be done in a branch, then squash-merged into master. That way, master becomes a great overview of individual features rather than the nitty-gritty of all the back and forth of a feature.
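With that setting in place, merging a feature branch into master behaves as if --squash had been passed every time (hypothetical branch name; note that a squash merge stages the combined changes but leaves the actual commit to you):

git checkout master
git merge feature/fancy-widget
git commit -m "Add the fancy widget feature"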

git config --global pull.rebase true
This always rebases your local commits or, in plain English: it always puts your local, unpushed commits on top of the history when you pull changes from the server. I find this useful because, if I’m working on something for a while, I can regularly pull in other people’s changes without fracturing my history. Yes, this is history rewriting, but I care more for a useful history than a “pure” one.

Combined with git repository hosting (GitLab, GitHub Enterprise, etc.), I found that browsing the history is a really useful tool to keep up with code changes (especially across timezones) – provided that the history is actually useful.

When nslookup works but you can’t ping it, NetBIOS may be missing

I have a custom DNS Server (running Linux), and I also have another server running Linux (“MyServer”). The DNS Server has an entry in /etc/hosts for MyServer.

On my Windows machines, I can nslookup MyServer and get the IP back, but when I try to access the machine through ping or any of the services it offers, the name doesn’t resolve. Access via the IP Address works fine though.

What’s interesting is that if I add a dot at the end (ping MyServer.) then it suddenly works. What’s happening?!

What’s happening is that for simple, unqualified names, Windows doesn’t use DNS but NetBIOS name resolution. nslookup talks directly to the DNS Server, but anything else doesn’t use DNS. (The trailing dot makes the name fully qualified, which forces DNS resolution – hence ping MyServer. works.)

The trick was to install Samba on MyServer, because it includes a NetBIOS name server (nmbd). On Ubuntu 16.04, just running sudo apt-get install samba installs and auto-starts the service, and from that moment on, my Windows machines could access it without issue.

There are ways to not use NetBIOS, but I didn’t want to make changes on every Windows client (since I’m using a Domain), so this was the simplest solution I could find. I still needed entries in my DNS Server so that Mac OS X can resolve it.