How-To
2020

How to Use Kubernetes on Azure for Cloud Computing
For the Review, Appraisal, and Triage of Mail (RATOM) project, funded by the Andrew W. Mellon Foundation, we were tasked with deploying to a Microsoft Azure environment. More details about the project are in our first blog post in this Learn With Us blog series. Caktus has experience with Amazon Web Services (AWS) and Google Cloud, but we hadn't yet had the chance to use Azure, so we looked forward to working in that environment and documenting our experience. The entire deployment process is available on GitHub as a reference under the StateArchivesOfNorthCarolina/ratom-deploy repository.

What to do About Email: How to Extract Data from Microsoft PST Files
In my previous line of work as an archivist, the question of what to do about email archives was an ongoing and deeply considered topic. Email is everywhere. Yes, even Gen Z and millennials use it, despite thousands of think pieces that would have you believe that the old ways are giving way to business meetings conducted on fixed-gear bicycles, over avocado toast and Instagram.

Creating a Sub-account in Amazon Web Services
Anytime we host resources for a client within the Caktus Amazon Web Services (AWS) account, we set up a sub-account and put the resources there. Some of the advantages of doing this compared to putting a client's resources in our main account are:

How to Use the "docker" Docker Image to Run Your Own Docker daemon
There is a Docker image on Docker Hub called docker. It comes in two flavors, "stable" and "dind" (Docker-in-Docker). What is this image for, and what is the purpose of these two image tags?

Our Top 19 Blogs of 2019
During the last year we gave our popular technical blog an official name: Developer Access. We published 32 posts on the blog, including technical how-tos, conference information, web development best practices, and detailed guides. Among all those posts, 19 rose to the top of the popularity list (based on total pageviews):

2019

How to Do Wagtail Data Migrations
Wagtail is a fantastic content management system that does a great job of making it easy for developers to get a new website up and running quickly and painlessly. It’s no wonder that Wagtail has grown to become the leading Django-based CMS. As one of the creators of Wagtail recently said, it makes the initial experience of getting a website set up and running very good. At Caktus, Wagtail is our go-to framework when we need a content management system.

How to Import Multiple Excel Sheets in Pandas
Pandas is a powerful Python data analysis tool. It's used heavily in the data science community since its data structures make real-world data analysis significantly easier. At Caktus, in addition to using it for data exploration, we also incorporate it into Extract, Transform, and Load (ETL) processes.

How to Set Up a Centralized Log Server with rsyslog
For many years, we've been running an ELK (Elasticsearch, Logstash, Kibana) stack for centralized logging. We have a specific project that requires on-premises infrastructure, so sending logs off-site to a hosted solution was not an option. Over time, however, the maintenance requirements of this self-maintained ELK stack became staggering. Filebeat, for example, filled up all the disks on all the servers in a matter of hours, not once, but twice (and for different reasons) when it could not reach its Logstash/Elasticsearch endpoint. Metricbeat suffered from a similar issue: It used far too much disk space relative to the value provided in its Elasticsearch indices. And while provisioning a self-hosted ELK stack has gotten easier over the years, it's still a lengthy process that requires extra care anytime an upgrade is due. Are these problems solvable? Yes. But we needed a simpler solution.

How to Switch to a Custom Django User Model Mid-Project
The Django documentation recommends always starting your project with a custom user model (even if it's identical to Django's to begin with), to make it easier to customize later if you need to. But what are you supposed to do if you didn't see this when starting a project, or if you inherited a project without a custom user model and you need to add one?

Coding for Time Zones & Daylight Saving Time — Oh, the Horror
In this post, I review some reasons why it's really difficult to program correctly with times, dates, time zones, and daylight saving time, and then give some advice for working with them in Python and Django. I'll also go over why I hate daylight saving time (DST).
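
As a small illustration of the kind of trap the post is about (this sketch is not taken from the post itself), here is how adding "one day" to a time-zone-aware datetime in Python behaves across a DST change. The zone and dates are assumptions chosen for the example: America/New_York and the 2021 spring-forward weekend. It uses the standard-library zoneinfo module (Python 3.9+), which needs time zone data available on the system.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumed zone for illustration; US clocks sprang forward on 2021-03-14.
eastern = ZoneInfo("America/New_York")

# Noon local time on the day before the change.
before = datetime(2021, 3, 13, 12, 0, tzinfo=eastern)

# Adding a timedelta to an aware datetime is wall-clock arithmetic:
# the result is noon the next day, even though only 23 real hours pass.
wall_clock = before + timedelta(days=1)

# To move forward by 24 elapsed hours, do the arithmetic in UTC
# and convert back to local time afterwards.
elapsed = (before.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(eastern)

# Real time between the two "noon" readings, measured in UTC.
real_gap = wall_clock.astimezone(timezone.utc) - before.astimezone(timezone.utc)

print(wall_clock.isoformat())  # local noon on 2021-03-14
print(elapsed.isoformat())     # one hour later on the local clock
print(real_gap)                # shorter than 24 hours
```

Which of the two results is "right" depends on whether the application means "same time tomorrow" or "24 hours from now"; the point is only that the two are not the same thing around a DST transition.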