
Chocolatey Community Feed State of the Union

tl;dr: Everything on https://chocolatey.org/notice is coming to fruition! We've automatically tested over 6,500 packages, a validator service is coming online now to check package quality, and the unreviewed backlog has been reduced by 1,000 packages! We sincerely hope that maintainers who have been waiting weeks and months to get something reviewed can understand that we've dug ourselves into a moderation mess and are currently finding our way out of this situation.

Notice on Chocolatey.org
We've added a few things to Chocolatey.org (the community feed) to help speed up review times for package maintainers. A little over a year ago we introduced moderation for all new package versions (aside from trusted packages), and from the user perspective it has been a fantastic addition. Usage has gone up to over 20 million package installs in one year, versus just 5 million in the three years before it! The response from the user community has been overwhelming. Let me say that again for effect: Chocolatey's community package usage in one year was four times that of the prior three years combined!

But let's be honest: we've nearly failed in another area, keeping the moderation backlog low. We introduced moderation as a security measure for Chocolatey's community feed because it was necessary, but we introduced it too early. We didn't have the infrastructure automation in place to handle the sheer load of packages that were suddenly thrown at us. And once we put moderation in place, more folks wanted to use Chocolatey, so it suddenly became much more popular. And because we have automation surrounding updating and pushing packages (namely automatic packages), we had some folks who would submit 50+ packages at a time. With one particular maintainer submitting 200 packages automatically, and a review of each of them taking somewhere between 2 and 10 minutes, you don't have to be a detective to see how this becomes a problem. And from the backlog you can see it really hasn't worked out well.

Screenshot: 1,597 packages in the submitted status

The most important number to understand here is the submitted count (underlined). This is the number of packages that a moderator has not yet looked at. A goal is to keep this well under 100. We want the time from a high-quality package being submitted to it being approved to be within 1-2 days.

Moderation has, until recently, been a very manual process. Sometimes, which moderator looked at your package determined whether it was held in review for various reasons. We've added moderators and more guidance around moderation to help bring a more structured review process. But it's not enough.

Some of you may not know this, but our moderators are volunteers and we currently lack full-time employees to help fix many of the underlying issues. We've also needed to work towards Kickstarter delivery and the Chocolatey rewrite (making choco better for the long term). That doesn't make the long wait to fix moderation any better news, but hopefully it brings some understanding. Our goal is to eventually bring on full-time employees, but we are not there yet. The Kickstarter was a start, but it was just that: a kick start. A few members of the core team who are also moderators have focused on ensuring the Kickstarter turns into a model that can ensure the longevity of Chocolatey. It may have felt like we were ignoring the needs of the community, but that has not been our intention at all. It's just been really busy, and we needed to address multiple areas surrounding Chocolatey with a small number of volunteers.

So What Have We Fixed?

All moderation review communication is done on the package page. Now all review happens on the website, which means there is no more email back and forth (the older process) that left what looked like one-sided conversations on the site. This is a significant improvement.

Package review logging. Now you can see right in the discussion who submitted a package and when, when statuses change, and where the conversation stands.

package review logging

More moderators. A question that comes up quite a bit is how many moderators we have and whether we are adding more. We have added more moderators: we are up to 12 moderators for the site. Moderators are chosen based on trust, usually built by being extremely familiar with Chocolatey packaging and what is expected of approved packages. Learning what is expected usually comes from having a few of your own packages approved. We've written most of this up at https://github.com/chocolatey/choco/wiki/Moderation.

Maintainers can self-reject packages that no longer apply. Say your package downloads the software from a URL that never changes: older package versions sitting in the queue are no longer applicable, and you can now purge them yourself rather than wait on a moderator.

The package validation service (the validator). The validator checks the quality of a package based on the requirements, guidelines and suggestions for creating packages for Chocolatey's community feed. Many of the validation items will eventually roll back into choco itself and be displayed when you pack a package. We like to think of the validator as unit testing: it validates that everything is as it should be and meets the minimum requirements for a package on the community feed.

validation results

The package verifier service (the verifier). The verifier checks the correctness (that the package actually works), that it installs and uninstalls correctly, has the right dependencies to ensure it is installed properly and can be installed silently. The verifier runs against both submitted packages and existing packages (checking every two weeks that a package can still install and sending notice when it fails). We like to think of the verifier as integration testing. It’s testing all the parts and ensuring everything is good. On the site, you can see the current status of a package based on a little colored ball next to the title. If the ball is green or red, the ball is a link to the results (only on the package page, not in the list screen).

passed verification - green colored ball with link

  • Green means it passed verification. The ball is a link to the results.
  • Orange means it is still pending verification (it has not yet run).
  • Red means it failed verification for some reason. The ball is a link to the results.
  • Grey means unknown, or excluded from verification (if excluded, a reason will be listed on the package page).

Coming Soon – Moderators will automatically be assigned to backlog items. Once a package passes both validation and verification, a moderator will automatically be assigned to review it. This will be added once the backlog is in a manageable state.

What About Maintainer Drift?

Many maintainers come in to help out at different times in their lives, and they do it nearly always as volunteers. Sometimes it's about the tools they are using at the time, and sometimes it has to do with where they work. Over time folks' preferences and workplaces change, and so maintainers drift away from keeping packages up to date because they have no internal incentive to continue maintaining those packages. It's a natural human response. I've been thinking about ways to reduce maintainer drift for the last three years, and I keep coming back to the idea that consumers of those packages could provide a one-time or weekly tip to the maintainer(s) as a thank you for keeping the package(s) updated. We are talking to Gratipay now – https://github.com/gratipay/inside.gratipay.com/issues/441. I feel this, in addition to a reputation system, will go a long way towards reducing maintainer drift.

Final Thoughts

Package moderation review time is down to mere seconds, as opposed to the minutes it took before. This allows a moderator to review and approve package versions much more quickly, which will reduce our backlog and keep it lower.

It's already working! The unreviewed backlog is down by 1,000 packages from the month prior. This is because a moderator no longer has to wait until they can get a machine up, ready for testing and in the right state; packages can now be reviewed faster. And this is only with the verifier in place, purely testing package installs. The validator should cut review time down to near seconds. The total number of packages in the moderation backlog has also been reduced, but honestly I usually only pay attention to the unreviewed backlog number, as it is the most important metric for me.

The verifier has rolled through over 6,500 verifications to date! https://gist.github.com/choco-bot/

When chocobot hit 6500 packages verified

We sincerely hope that the current maintainers who have been waiting weeks and months to get something reviewed can be understanding that we’ve dug ourselves into a moderation mess and are currently finding our way out of this situation. We may have some required findings and will ask for those things to be fixed, but for anything that doesn’t have required findings, we will approve them as we get to them.


Testing with Data

It’s not a coincidence that this is coming off the heels of Dave Paquette’s post on GenFu and Simon Timms’ post on source control for databases in the same way it was probably not a coincidence that Hollywood released three body-swapping movies in the 1987-1988 period (four if you include Big).

I was asked recently for some advice on generating data for use with integration and UI tests. I already have some ideas but asked the rest of the Western Devs for some elucidation. My tl;dr version is the same as what I mentioned in our discussion on UI testing: it’s hard. But manageable. Probably.

The solution needs to balance a few factors:

  • Each test must start from a predictable state
  • Creating that predictable state should be as fast as possible
  • Developers should be able to figure out what is going on by reading the test

The two options we discussed both assume the first factor to be immutable. That means you either clean up after yourself when the test is finished or you wipe out the database and start from scratch with each test. Cleaning up after yourself might be faster but has more moving parts. Cleaning up might mean different things depending on which step you’re in if the test fails.

So given that we will likely re-create the database from scratch before each and every test, there are two options. My current favourite solution is a hybrid of the two.

Maintain a database of known data

In this option, you have a pre-configured database. Maybe it’s a SQL Server .bak file that you restore before each test. Maybe it’s a GenerateDatabase method that you execute. I’ve done the latter on a Google App Engine project, and it works reasonably well from an implementation perspective. We had a class for each domain aggregate and used dependency injection. So adding a new test customer to accommodate a new scenario was fairly simple. There are a number of other ways you can do it, some of which Simon touched on in his post.

We also had it set up so that we could create only the customer we needed for a particular test. That way, we could use a step like Given I'm logged into 'Christmas Town' and it would set up only that data.
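
To make the class-per-aggregate idea concrete, here is a minimal sketch of what such a generator might look like. None of these types come from the project described above; ITestDatabase, ITestCustomer and TestDatabaseGenerator are hypothetical names. The shape is the point: one class per test customer, all of them discovered through dependency injection, with a way to seed everything or just the one customer a test needs.

using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch only: these types are illustrative, not the actual project's code.
public interface ITestDatabase
{
    void Execute(string sql, object parameters = null);
}

public interface ITestCustomer
{
    string Name { get; }
    void Seed(ITestDatabase db);   // writes this customer's aggregate to the database
}

public class ChristmasTownCustomer : ITestCustomer
{
    public string Name => "Christmas Town";

    public void Seed(ITestDatabase db)
    {
        db.Execute("INSERT INTO Companies (Name, Owner) VALUES (@name, @owner)",
                   new { name = Name, owner = "Jack Skellington" });
        // ...plus whatever products, users, etc. this customer's scenarios need
    }
}

public class TestDatabaseGenerator
{
    private readonly ITestDatabase _db;
    private readonly IEnumerable<ITestCustomer> _customers;

    // The DI container injects every registered ITestCustomer,
    // so supporting a new scenario is just a matter of adding a new class.
    public TestDatabaseGenerator(ITestDatabase db, IEnumerable<ITestCustomer> customers)
    {
        _db = db;
        _customers = customers;
    }

    // Full pre-configured database: seed every known customer.
    public void GenerateDatabase()
    {
        foreach (var customer in _customers)
            customer.Seed(_db);
    }

    // Lighter option, backing a step like: Given I'm logged into 'Christmas Town'
    public void GenerateCustomer(string name) =>
        _customers.Single(c => c.Name == name).Seed(_db);
}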

There are some drawbacks to this approach. You still need to create a new class for a new customer if you need to do something out of the ordinary. And if you need to do something only slightly out of the ordinary, there’s a strong tendency to use an existing customer and tweak its data ever so slightly to fit your test’s needs, other tests be damned. With these tests falling firmly in the long-running category, you don’t always find out the effects of this until much later.

Another drawback: it's not obvious in the test exactly what data you need for that specific test. You can accommodate this somewhat with a naming convention. For example, Given I'm logged into a company from India, if you're testing how the app works with rupees. But that's not always practical. Which leads us to the second option.

Create an API to set up the data the way you want

Here, your API contains steps to fully configure your database exactly the way you want. For example:

Given I have a company named "Christmas Town" owned by "Jack Skellington"
And I have 5 product categories
And I have 30 products
And I have a customer
...

You can probably see the major drawback already: this can become very verbose. On the other hand, you have the advantage of seeing exactly what data is included, which is helpful when debugging. If your test data is wrong, you don't need to go mucking about in your source code to fix it. Just update the test and you're done.

Also note the lack of specifics in the steps. Whenever possible, I like to be very vague when setting up my test data. If you have a good framework for generating test data, this isn’t hard to do. And it helps uncover issues you may not account for using hard-coded data (as anyone named D’Arcy O’Toole can probably tell you).
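
To make "vague" concrete: with a generator like GenFu (the library mentioned at the top of this post), you ask for a customer or thirty products and let the tool invent plausible values. A minimal sketch, assuming GenFu's A.New and A.ListOf helpers and two made-up POCOs (these classes are stand-ins, not types from a real project):

using System.Collections.Generic;
using GenFu;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

public class Product
{
    public string Title { get; set; }
    public decimal Price { get; set; }
}

public static class VagueTestData
{
    // "Given I have a customer" - any realistic-looking customer will do.
    public static Customer ACustomer() => A.New<Customer>();

    // "And I have 30 products" - the count matters, the details don't.
    public static List<Product> SomeProducts(int count) => A.ListOf<Product>(count);
}

GenFu fills properties by naming convention, so fields like FirstName and Email come back looking plausible without any hard-coded values.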


Loading up your data with a granular API isn't realistic, which is why I like the hybrid solution. By default, you pre-load your database with some common data, like lookup tables with lists of countries, currencies, product categories, etc. Stuff that needs to be in place for the majority of your tests.

After that, your API doesn't need to be that granular. You can use something like Given I have a basic company, which will create the company, add an owner and maybe some products, and use that to test the process for creating an order. Under the hood, it will probably use the specific steps.
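
A rough sketch of what that composition might look like (the class and method names here are hypothetical, not taken from any particular project or framework):

// The coarse-grained step simply composes the granular ones.
public class Company
{
    public string Name { get; set; }
    public string Owner { get; set; }
}

public class TestDataApi
{
    // Granular steps: verbose but explicit, for tests that care about the details.
    public Company CreateCompany(string name, string owner)
    {
        // insert the company into the test database...
        return new Company { Name = name, Owner = owner };
    }

    public void AddProductCategories(Company company, int count) { /* ... */ }
    public void AddProducts(Company company, int count) { /* ... */ }
    public void AddCustomer(Company company) { /* ... */ }

    // The step behind "Given I have a basic company": the test is saying
    // it doesn't care how the company is set up, only that one exists.
    public Company CreateBasicCompany()
    {
        var company = CreateCompany("Some Company", "Some Owner");
        AddProductCategories(company, 5);
        AddProducts(company, 30);
        AddCustomer(company);
        return company;
    }
}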

One reason I like this approach: it hides only the details you don't care about. When you say Given I have a basic company and I change the name to "Rick's Place", that tells me, "I don't care how the company is set up but the company name is important". Very useful to help narrow the focus of the test when you're reading it.

This approach will understandably lead to a whole bunch of different methods for creating data of various sizes and coarseness. And for that you’ll need to…

Maintain test data

Regardless of your method, maintaining your test data will require constant vigilance. In my experience, there is a tremendous urge to take shortcuts when it comes to test data. You’ll re-use a test company that doesn’t quite fit your scenario. You’ll alter your test to fit the data rather than the other way around. You’ll duplicate a data setup step because your API isn’t discoverable.

Make no mistake, maintaining test data is work. It should be treated with the same respect and care as the rest of your code. Possibly more so, since the underlying code (in whatever form it takes) technically won't be tested. Shortcuts and bad practices should not be tolerated or excused because "it's just test data". Fight the urge to let things slide. Call it out as soon as you see it. Refactor mercilessly once you see opportunities to do so.

Don’t be afraid to flip over a table or two to get your point across.

– Kyle the Unmaintainable


Running a .NET app against a Postgres database in Docker

Some days/weeks/time ago, I did a presentation at MeasureUP called “Docker For People Who Think Docker Is This Weird Linux Thing That Doesn’t Impact Me”. The slides for that presentation can be found here and the sample application here.

Using the sample app with PostgreSQL

The sample application is just a plain ol’ .NET application. It is meant to showcase different ways of doing things. One of those things is data access. You can configure the app to access the data from SQL storage, Azure table storage, or in-memory. By default, it uses the in-memory option so you can clone the app and launch it immediately just to see how it works.

PancakeProwler

Quick summary: Calgary, Alberta hosts an annual event called the Calgary Stampede. One of the highlights of the 10-ish day event is the pancake breakfast, whereby dozens/hundreds of businesses offer up pancakes to people who want to eat like the pioneers did, assuming the pioneers had pancake grills the size of an Olympic swimming pool.

The sample app gives you a way to enter these pancake breakfast events and each day, will show that day’s breakfasts on a map. There’s also a recipe section to share pancake recipes but we won’t be using that here.

To work with Docker, we need to set the app up to use a data access mechanism that works in Docker. The sample app supports Postgres, so that will be our database of choice. Our first step is to get the app up and running locally with Postgres, without Docker. So, assuming you have Postgres installed, find the ContainerBuilder.cs file in the PancakeProwler.Web project. In this file, comment out the following near the top of the file:

// Uncomment for InMemory Storage
builder.RegisterAssemblyTypes(typeof(Data.InMemory.Repositories.RecipeRepository).Assembly)
       .AsImplementedInterfaces()
       .SingleInstance();

And uncomment the following later on:

// Uncomment for PostgreSQL storage
builder.RegisterAssemblyTypes(typeof(PancakeProwler.Data.Postgres.IPostgresRepository).Assembly)
    .AsImplementedInterfaces().InstancePerRequest().PropertiesAutowired();

This configures the application to use Postgres. You'll also need to do a couple more tasks:

  • Create a user in Postgres
  • Create a Pancakes database in Postgres
  • Update the Postgres connection string in the web project’s web.config to match the username and database you created

The first two steps can be accomplished with the following script in Postgres:

CREATE DATABASE "Pancakes";

CREATE USER "Matt" WITH PASSWORD 'moo';

GRANT ALL PRIVILEGES ON DATABASE "Pancakes" TO "Matt";

Save this to a file. Change the username/password if you like but be aware that the sample app has these values hard-wired into the connection string. Then execute the following from the command line:

psql -U postgres -a -f "C:\path\to\sqlfile.sql"

At this point, you can launch the application and create events that will show up on the map. If you changed the username and/or password, you’ll need to update the Postgres connection string first.

You might have noticed that you haven't created any tables yet, but the app still works. The sample is helpful in this regard because all you need is a database. If the tables aren't there yet, they will be created the first time you launch the app.
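
How might that work? This isn't the sample's actual code, but the usual pattern is an idempotent schema script run at startup through Npgsql, along these lines (the table and column names are invented for illustration):

using Npgsql;

// Sketch of the "create the schema on first launch" pattern, not the sample
// application's real code; the breakfasts table here is made up.
public static class SchemaInitializer
{
    public static void EnsureTables(string connectionString)
    {
        const string sql = @"
            CREATE TABLE IF NOT EXISTS breakfasts (
                id      serial PRIMARY KEY,
                name    text   NOT NULL,
                held_on date   NOT NULL
            );";

        using (var connection = new NpgsqlConnection(connectionString))
        using (var command = new NpgsqlCommand(sql, connection))
        {
            connection.Open();
            command.ExecuteNonQuery();   // a no-op when the table already exists
        }
    }
}

Because CREATE TABLE IF NOT EXISTS does nothing when the table is already there, it is safe to run on every startup, and an empty Pancakes database is all the app needs.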

Note: recipes rely on having a search provider configured. We won’t cover that here but I hope to come back to it in the future.

Next, we'll switch things up so you can run this against Postgres running in a Docker container.

Switching to Docker

I’m going to give away the ending here and say that there is no magic. Literally, all we’re doing in this section is installing Postgres on another “machine” and connecting to it. The commands to execute this are just a little less click-y and more type-y.

The first step, of course, is installing Docker. At the time of writing, this means installing Docker Machine.

With Docker Machine installed, launch the Docker Quickstart Terminal and wait until you see an ASCII whale:

Docker Machine

If this is your first time running Docker, just know that a lightweight Linux virtual machine has been launched in VirtualBox on your machine. Check your Start screen and you’ll see VirtualBox if you want to investigate it but the docker-machine command will let you interact with it for many things. For example:

docker-machine ip default

This will give you the IP address of the default virtual machine, which is the one created when you first launched the Docker terminal. Make a note of this IP address and update the Postgres connection string in your web.config to point to it. You can leave the username and password the same:

<add name="Postgres" connectionString="server=192.168.99.100;user id=Matt;password=moo;database=Pancakes" providerName="Npgsql" />

Now we’re ready to launch the container:

docker run --name my-postgres -e POSTGRES_PASSWORD=moo -p 5432:5432 -d postgres

Breaking this down:

  • docker run: Runs a Docker container from an image.
  • --name my-postgres: The name we give the container to make it easier for us to work with. If you leave this off, Docker will assign a relatively easy-to-remember name like "floral-academy" or "crazy-einstein". You also get a less easy-to-remember identifier which works just as well but is…less…easy-to-remember.
  • -e POSTGRES_PASSWORD=moo: The -e flag passes an environment variable to the container. In this case, we're setting the password of the default postgres user.
  • -p 5432:5432: Publishes a port from the container to the host. Postgres runs on port 5432 by default, so we publish this port to be able to interact with Postgres directly from the host.
  • -d: Runs the container in the background. Without this, the command will sit there waiting for you to kill it manually.
  • postgres: The name of the image the container is created from. We're using the official postgres image from Docker Hub.

If this is the first time you’ve launched Postgres in Docker, it will take a few seconds at least, possibly even a few minutes. It’s downloading the Postgres image from Docker Hub and storing it locally. This happens only the first time for a particular image. Every subsequent postgres container you create will use this local image.

Now we have a Postgres container running. Just like with the local version, we need to create a user and a database. We can use the same script as above and a similar command:

psql -h 192.168.99.100 -U postgres -a -f "C:\path\to\sqlfile.sql"

The only difference is the addition of -h 192.168.99.100. You should use whatever IP address you got above from the docker-machine ip default command here. For me, the IP address was 192.168.99.100.

With the database and user created, and your web.config updated, we'll need to stop the application in Visual Studio and re-run it. The reason is that the application won't recognize that we've changed databases, so we need to "reboot" it to trigger the process for creating the initial table structure.

Once the application has been restarted, you can now create pancake breakfast events and they will be stored in your Docker container rather than locally. You can even launch pgAdmin (the Postgres admin tool) and connect to the database in your Docker container and work with it like you would any other remote database.

Next steps

From here, where you go is up to you. The sample application can be configured to use Elastic Search for the recipes. You could start an Elastic Search container and configure the app to search against that container. The principle is the same as with Postgres. Make sure you publish both ports 9200 and 9300 and update the ElasticSearchBaseUri entry in web.config. The command I used in the presentation was:

docker run --name elastic -p 9200:9200 -p 9300:9300 -d elasticsearch

I also highly recommend Nigel Poulton’s Docker Deep Dive course on Pluralsight. You’ll need access to Linux either natively or in a VM but it’s a great course.

There are also a number of posts right here on Western Devs, including an intro to Docker for OSX, tips on running Docker on Windows 10, and a summary or two on a discussion we had on it internally.

Other than that, Docker is great for experimentation. Postgres and Elastic Search are both available pre-configured in Docker on Azure. If you have access to Azure, you could spin up a Linux VM with either of them and try to use that with your application. Or look into Docker Compose and try to create a container with both.

For my part, I’m hoping to convert the sample application to ASP.NET 5 and see if I can get it running in a Windows Server Container. I’ve been saying that for a couple of months but I’m putting it on the internet in an effort to make it true.


Are you in the mood for HTTP?

Sometimes you are just in that mood (no, not that mood ;-)), you know, that mood when you want to talk HTTP and APIs with a bunch of people who care. Recently Darrel Miller and I realized we were in that mood, and so, with a little nudging from Jonathan Channon, we decided now was a good time. And so "In the Mood for HTTP" was born.

It is a new Q&A-style show where folks submit questions on all things HTTP and Darrel and I give answers. Every show is live via Google Hangouts on Air, AND it is recorded and immediately available. In terms of content, one thing I think is really nice is that we're getting to dive into some really deep areas of building APIs that are not well covered. For example, what level of granularity of media types should you use? Do microservices impact your API design? And much more!

We’re not always in agreement, we’re not always right. We do always have fun!

Read Darrel's blog post, which goes into more detail on the what and why, then come join us! You can find the previous episodes on our YouTube channel here.


Save your DNS and your SANITY when using VPN on a Mac (without rebooting)

There was a time when using my Mac was bliss from a DNS perspective. I never had to worry about my routing tables getting corrupted, and I could always rely on hosts getting resolved. Life was good! And then a combination of things happened and, well, those good old days are gone :-(

  • The networking stack on OSX went downhill.
  • I joined Splunk.
  • I started using a VPN on my Mac (We use Juniper SSL VPN).
  • I started having to deal with this now recurring nightmare of my DNS suddenly failing, generally after using the VPN.

If you use a VPN on a Mac, I am sure you've seen it. Suddenly you type "https://github.com" in your browser, and you get a 404. "Is Github down?" you ask your co-workers. "Nope, works perfectly fine for me." "Is hipchat down?" "Nope, I am chatting away."

Meanwhile, your browser looks something like this:

Screen Shot 2015-10-01 at 7.08.31 AM

AAARGH!

So you reboot, and then you find out that Github was up all along. The problem was that your routing tables got screwed up, somehow related to the VPN; either that, or the DNS demons have taken over your machine!

demons-1

After dealing with this constantly, you start to seriously lose your sanity! It will of course always happen at the most inopportune time, like when you are about to present to your execs or walk on stage!

But my friends, there is hope: I have a cure! This is a cure I learned from the wise ninjas at my office (thank you Danielle and Itay!). It is a little bash alias, and it will save you AND your DNS. Drop it in your .bash_profile and open a new terminal.

alias fixvpn="sudo route -n flush && sudo networksetup -setv4off Wi-Fi && sudo networksetup -setdhcp Wi-Fi"

Next time the DNS demons come to get you, run this baby from the shell. It will excommunicate those demons and quick.

Screen Shot 2015-10-01 at 7.09.48 AM

Wait a few seconds, and bring up that webpage again.

Screen Shot 2015-10-01 at 7.09.57 AM

Your DNS and sanity are restored!
