Michael Angstadt's Homepage

Blog

01/22/2020 9:05 pm
The library where I work subscribes to an online service that keeps track of the library's ongoing public events. We link to it from our website so that patrons can discover the various programs the library has to offer.

I wrote a WordPress plugin that posts a listing of these events on the library's events page using the RSS feed the service provides. I just noticed today that a few of the event titles had empty, square boxes in them. When you see this, it usually means there is a character encoding problem: the software displaying the text does not recognize a particular letter or symbol.


To speed up page loads, my plugin caches the RSS file it downloads so that it does not have to query the event service every time someone loads the events page on the website. I opened the cached RSS file to see what might be causing the problem. The event in question had curly quotes (also called smart quotes) in its title. My experience has been that curly quotes frequently cause problems when people try to use them on websites, so I wasn't surprised to see this.


(table from computerhope.org)

You can't type a curly quote on the keyboard. At least, not directly. In my experience, they usually appear when someone copies and pastes something from Microsoft Word because Word will automatically insert them into your document as you're typing to make your document look more aesthetically pleasing.

It turned out that the problem was with the RSS data's character encoding. An RSS file is just XML, and an XML file can declare its character encoding with an "encoding" attribute in the XML declaration at the top.

<?xml version="1.0" encoding="UTF-8" ?>

This tells the program that parses the XML what kind of character set the XML data uses so that all of its content will remain intact after being parsed. If this attribute does not reflect the character set that was actually used to create the XML file, the data may not be parsed correctly, and you may end up with "empty boxes" like the ones I was getting.

When I changed this encoding attribute to "UTF-8" (a widely used character encoding that supports many different languages), the empty boxes went away, and the curly quotes correctly appeared.
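Here is a small sketch of how this kind of mismatch can turn a curly quote into junk characters. The ISO-8859-1 encoding is only an assumption for illustration; I'm not stating what the feed actually declared.

```php
<?php
//a left curly quote (U+201C) is three bytes when encoded as UTF-8
$curlyQuote = "\xE2\x80\x9C";

//hypothetical mismatch: read those UTF-8 bytes as if they were ISO-8859-1
//each byte becomes its own character: U+00E2, U+0080, U+009C
$misread = mb_convert_encoding($curlyQuote, 'UTF-8', 'ISO-8859-1');

echo mb_strlen($curlyQuote, 'UTF-8'), "\n"; //1
echo mb_strlen($misread, 'UTF-8'), "\n";    //3
echo bin2hex($misread), "\n";               //c3a2c280c29c
```

Two of the three resulting characters (U+0080 and U+009C) are control characters with no visible glyph, which is the kind of thing browsers often draw as an empty box.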


To prevent this from happening in the future, I modified my WordPress plugin to change the RSS feed's declared character encoding right before it is saved to the cache. A simple str_replace function call seemed to do the trick. I thought I might have to use the mb_convert_encoding function to convert the entire file, but this did not appear to be necessary, which suggests the feed's data was already UTF-8 and only the declaration was wrong.
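A minimal sketch of that idea is below. It is not the plugin's actual code: the function name is made up for illustration, and it uses preg_replace instead of str_replace so that it works no matter which encoding the feed originally declared. It is only safe when the feed's bytes really are UTF-8, because it changes the label and not the data.

```php
<?php
/**
 * Rewrites the encoding named in the XML declaration to UTF-8.
 * Hypothetical helper: it does NOT convert the data itself.
 */
function forceUtf8Declaration(string $xml): string
{
	//group 1: everything up to "encoding=", group 2: the quote character
	$pattern = '~^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2~i';

	//only the first match (the declaration at the top) is replaced
	return preg_replace($pattern, '${1}${2}UTF-8${2}', $xml, 1);
}

$feed = '<?xml version="1.0" encoding="ISO-8859-1" ?><rss></rss>';
echo forceUtf8Declaration($feed), "\n";
```

A feed that has no XML declaration, or one that already says UTF-8, comes back unchanged.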
01/20/2020 9:41 pm
I discovered an annoying issue recently related to our switch from a Windows-based file server to a Linux-based one.

The computer lab in the library where I work uses an Access database to record registration information of students who sign up for the adult computer classes that the lab runs. Several years ago, I wrote a Java Swing application that sits on top of this database to make the registration process faster and easier. I've gotten many compliments about it, and it's been working well over the years with few hiccups.

The database file is hosted on a dedicated server that I administer, which the lab uses for file storage. This means that multiple people could be accessing the database on different computers at the same time. The app must consider the possibility of another user on another machine updating the database while the app is open. To do this, I designed the app to monitor the database file for changes. When a change occurs, it reloads the information from the database so that the user always sees the most recent information on the screen.

About a year ago, I started researching a replacement for our file server. It was over 5 years old and out of warranty, and its version of Windows (Windows Server 2008) was fast approaching end-of-life. Because we mainly used it for file storage, I recommended we get a Linux-based NAS device because they are cheaper than a full-blown Windows server.

The two major competitors in the NAS space are QNAP and Synology. I decided to go with QNAP because they offered a model that came with more RAM and an HDMI port (most NAS devices do not come with any video ports). My reasoning was that the HDMI port would allow me to administer the device just like a normal server if the network ever went down.

The library purchased it a few months ago, and I deployed it last week. It has been working well so far, but one hiccup I've encountered is that, ever since I deployed it, the class registration app I wrote has been slow to detect updates to the database file. Like, really slow. About 10 seconds slow.

I ran some tests and found that when my app updated the database, the "date modified" timestamp of the database file was slow to update. This was why my app was taking so long to detect any changes to it. But when I wrote a test program that just saved some content to a text file on the file server, the text file's timestamp would update immediately. Why, then, was the Access file behaving differently?

I thought that maybe the open-source library my app uses to interface with the Access database, called "Jackcess", was doing something different under the hood. I started a thread on their support forum and learned that they use Java's "RandomAccessFile" class to write to the database. This class allows you to update small sections of a file individually instead of rewriting the entire file from scratch, as most programs do when they update a file.

So, it looks like the culprit here is the new Linux-based NAS server. It seems to handle this kind of file update differently than the old Windows server did. The simplest workaround I can think of would be to create some kind of dummy file and monitor that for changes instead of the database. The app would then update the dummy file whenever it makes a change to the database, which would cause the dummy file's "date modified" timestamp to update, signaling to other computers on the network that the database was just changed. That should hopefully get things back to normal!
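The dummy-file idea can be sketched roughly as follows. The app itself is written in Java; this sketch uses PHP only to match the other code on this page, and the function names and the marker file are hypothetical. It also assumes that touching a small file updates its timestamp right away on the NAS, which the text-file test above suggests but does not prove.

```php
<?php
//writer side: call this right after every change to the database
function signalDatabaseChanged(string $markerFile): void
{
	//touch() creates the file if needed and sets its modified time to now
	touch($markerFile);
}

//reader side: has the marker's timestamp changed since we last looked?
function markerChangedSince(string $markerFile, int $lastSeenMtime): bool
{
	//PHP caches file stats, so clear the cache before reading the timestamp
	clearstatcache(true, $markerFile);
	return is_file($markerFile) && filemtime($markerFile) !== $lastSeenMtime;
}
```

Each computer would remember the last timestamp it saw, poll markerChangedSince() on a timer, and reload the data whenever it returns true.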
06/03/2019 8:25 pm
I stumbled upon a thread on the r/sysadmin board with tons of great Windows tips. Here is a list of the ones I found the most interesting:

File Explorer to cmd
Open File Explorer and navigate to any directory. Click into the address bar, type "cmd", and press Enter. A command prompt window will open at that location (source).

cmd to File Explorer
Conversely, open a command prompt window and type "start .". A File Explorer window will open to whatever your current directory is (source).

cmd and F7
Press F7 at a command prompt to display a selectable command history. It doesn't work well for really long commands, but for shorter ones it works great (source).




Copy as path
In File Explorer, hold Shift while right-clicking a file. This makes a "Copy as path" option available on the context menu. Clicking this will copy the file's absolute path to the clipboard (source).

Copy text in dialog boxes
It is possible to copy the text in a dialog box, even if you can't highlight it! This is great for extracting error messages so you can Google them. Simply click on the dialog box to make sure it has focus, then press Ctrl+C. In my testing, this didn't work with Microsoft Office dialog boxes, so it may not work everywhere (source).

For example, below is the pasted content from a Notepad dialog box.
[Window Title]
Notepad

[Main Instruction]
Do you want to save changes to Untitled?

[Save] [Don't Save] [Cancel]

Screenshots of windows
If you're not using any third-party screenshot tools, pressing Alt+PrtScn copies a screenshot of the currently active window to the clipboard. No more having to crop full-screen screenshots with Paint (source)!

And here are a couple of tips of my own:

Desktop keyboard shortcuts
It's possible to assign keyboard shortcuts to desktop icons. Right-click on the icon and click "Properties". Click into the "Shortcut key" field and type a letter. The shortcut will now be Ctrl+Alt+whatever letter you typed.



Show Desktop
I use this one all the time. Pressing Win+D hides all of your windows so you can see the desktop. Useful!

Emoji keyboard
Last but not least: while inside a textbox, press Win+. (Windows key + period key) to open an emoji keyboard.

04/18/2019 6:33 pm
At my workplace, I administer a Windows Server 2008 R2 web server that makes daily backups to an external hard drive using Windows Server Backup. The backups have been consistently failing recently, and I've found out why.

Even though the backups were not finishing successfully, Windows Server Backup labeled the backups as “Completed with warnings” because it was able to back up some of the data before failing. The backups would always fail at the same point: after backing up approximately 12 GB of data. The failures were intermittent at first, but became more consistent over time to the point where they were happening every night.

Error message

The error message Windows Server Backup was giving me was very vague:
"The operation failed due to a device error encountered with either the source or destination. Detailed error: The request could not be performed because of an I/O device error."

The error message in the event logs was equally vague:
“The backup operation that started at
(Contrary to the message’s suggestion, I could not find anything useful in the event details.)

Bad sector?

Many websites suggested that this error could be caused by a bad sector on the source or destination drive. I ran chkdsk against the 1 TB external hard drive that the backups were being stored on. The command took about 5 hours to run. It was alarming to see that the command consumed 4 GB of memory while running, but according to numerous websites, this is normal. No errors were found.

Unfortunately, I couldn’t run chkdsk on the server’s hard drive because it would require that the machine be put out of service. Chkdsk cannot check the system drive while Windows is running; the server has to reboot so that the check can run before Windows starts. Since the server hosts our website and another library’s website, I couldn’t take it offline without advance notice. So, that wasn’t an option.

USB port?

I tried plugging the external drive into a different USB port, thinking maybe the USB port was bad. A long shot, but you never know. No effect.

Large file?

One website suggested it could be failing because a file was too large. I knew of one particularly large file on the system: “C:\MySQL Data files\ibdata1”. This file is where MySQL stores its databases. The file is over 1 GB, so I tried excluding that file from the backup.

(In order to exclude specific files from the backup, I had to disable the “Bare metal recovery” option in the backup settings.)

Reparse points

Before I could complete the configuration wizard, I got an error message:
“One of the paths specified for backup is under a reparse point. Back up of files under a reparse point is not supported. Specify a file path that contains the destination of the reparse point, and then retry the operation.”
What is a reparse point? A reparse point is kind of like a shortcut. For example, there is a reparse point for the old “C:\Documents and Settings” folder that just redirects to the “C:\Users” folder.

To see all the system’s reparse points, run the following command at the root of the drive (/s searches all subfolders, and /al lists only entries that have the reparse point attribute):

dir /s /al

I found that the server had a LOT of reparse points. Most of them were in the user folders and were for the old Windows XP common folder names (for example, a reparse point named “My Pictures” existed, which redirected to the “Pictures” folder).

Due to the number of reparse points on the system, excluding all of them individually wasn’t a practical option. I decided I would just exclude the “Documents and Settings” reparse point to see if that would be enough.

This time, I could complete the configuration wizard and run the backup. The backup even completed successfully (though it took 8 hours to complete, which is much longer than the 1 hour it used to take).

The solution

The backup was successful, but it failed to back up a single file: an old IIS log file. The error message was:
“Error in backup of during read: Error [0x8007045d] The request could not be performed because of an I/O device error.”
Another vague “I/O device error” message. I tried copying the file to the desktop to see if the file could be read. The copy operation failed with a similar error.

This made me think that there was something corrupted about the file, which was causing the backup to fail. Maybe a bad sector on the hard drive. I deleted the file, changed the backup settings back to “Bare metal recovery” and ran the backup again. The backup succeeded!




08/12/2018 6:32 pm
The public access computers at the library where I work use software called Deep Freeze, which prevents any changes to the computer from persisting between boots. Rebooting the computer discards any changes made since the last boot. The software is essential for a public-access environment, as it prevents users from doing any long-term damage to the system and also helps with privacy.

Deep Freeze does its job wonderfully, but I recently started noticing some issues with installing Windows Updates when we switched to Windows 10. The reason I think these problems are caused by Deep Freeze is that our staff computers, which are nearly identical to our public ones, do not have Deep Freeze installed on them, and they have not experienced these problems.

Problem 1: "Undoing changes"

During the phase of the update process when the updates are installed after you reboot the computer, Windows reports that the updates could not be installed and that it's "undoing changes".



The solution that I discovered was to run the following commands BEFORE checking for updates. These commands must be run from an admin-level command prompt.

sfc /scannow
dism /Online /Cleanup-Image /RestoreHealth

If you have ever had to troubleshoot a Windows problem, odds are you have seen these two commands before, as they are floating all over the Internet in tech help forums. I like to think of them as general-purpose troubleshooting commands that are good to run if you are having any problem with the Windows operating system itself. In my experience, there's also no harm in running them.

The first command checks Windows' operating system files for corruption. In my case, it always reports that it found corrupted Windows files and that it fixed them. The second command, in my case, doesn't report that it found any problems, so it may not be necessary for this particular problem.

Since this problem has recurred so many times for me, I have now made it a part of my routine to run these commands before checking for updates.

Problem 2: Booting to the blue "Automatic Repair" screen


This problem only happens after installing Windows 10 feature updates (as opposed to "quality" updates, which are smaller and more frequent). When the computer is turned on, it sometimes (but not always) boots to a blue screen titled "Automatic Repair" (pictured below).


This screen will either (a) report that it repaired some problems and prompt you to reboot your computer, or (b) report that it couldn't repair the problems and prompt you to shut down your computer. In the latter case, clicking "Advanced options", then "Continue" will boot the computer normally. The screen appears roughly half the time the computer is turned on.

The solution to this problem is first to uninstall Deep Freeze. Then, run the two commands above. Finally, reinstall Deep Freeze.

To prevent this issue from happening in the first place, uninstall Deep Freeze before installing the Windows update.

How this page works

Last Updated: 1/3/2012

My blog is actually hosted on blogger.com. The way I'm able to display my blog posts here is by parsing the blog's RSS feed. RSS feeds are used by blogs to help alert their avid readers whenever a new post is created. They are just XML files that contain data on the most recent blog posts. They include things like the title and publish date of each post, as well as the actual blog post text. I can use most of the data from my RSS feed without any trouble, but there are a few things I need to tweak in order to display everything properly.

Fixing the code samples

One tweak is fixing the code samples I often include in my posts. Blogger replaces all newlines in the blog post with <br /> tags. This is a problem because, due to the syntax highlighting library I use, the <br /> tags themselves show up in the code samples. So, I need to replace all of these tags with newline characters. However, I can't just replace all <br /> tags in the entire blog post because I only want to replace the tags that are within code samples. This means that I have to use something a little more complex than a simple search-and-replace operation:

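//the blog post (a sample value is used here for illustration)
$content = '<pre class="brush: php">echo 1;<br />echo 2;</pre>';
$contentFixed = preg_replace_callback('~(<pre\\s+class="brush:.*?">)(.*?)(</pre>)~', function($matches){
	//$matches[2] is the code between the opening and closing <pre> tags
	$code = $matches[2];
	$code = str_replace('<br />', "\n", $code);
	return $matches[1] . $code . $matches[3];
}, $content);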

Here, I'm using the preg_replace_callback PHP function, which will execute a function that I define every time the regular expression finds a match in the subject string. I know that each code sample is wrapped in a <pre> tag and that the tag has a class attribute whose value starts with "brush:", so I use that information to find the code samples. Then, for each match the regular expression finds, it calls my custom function, where I have it replace the <br /> tags with newlines.

Fixing the dates

Because the publish date of each blog post in the RSS feed is in the UTC timezone, I also have to convert each date to my local timezone. Otherwise, the dates will not be displayed correctly (like saying that I made a post at 2 am).

$dateFromRss = 'Tue, 20 Dec 2011 02:30:00 +0000';
$dateFixed = new DateTime($dateFromRss);
$dateFixed->setTimezone(new DateTimeZone('America/New_York'));
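To see what the conversion does, the sketch below formats the same date before and after the timezone is applied. The format string is an assumption, chosen to match how dates look on this page.

```php
<?php
$dateFromRss = 'Tue, 20 Dec 2011 02:30:00 +0000';

//keep a UTC copy so the two can be compared
$utcDate = new DateTime($dateFromRss);
$localDate = clone $utcDate;
$localDate->setTimezone(new DateTimeZone('America/New_York'));

//'m/d/Y g:i a' is an assumed display format
echo $utcDate->format('m/d/Y g:i a'), "\n";   //12/20/2011 2:30 am
echo $localDate->format('m/d/Y g:i a'), "\n"; //12/19/2011 9:30 pm
```

Note that the conversion changes the calendar day as well as the hour, since New York is five hours behind UTC in December.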

Adding Highslide support to images

One extra feature that I included is adding Highslide support to each image (Highslide is a "lightbox" library which lets you view images in special popup windows). To do this, I load the blog post into a DOM, use XPath to query for all links that have images inside of them, and then add the appropriate attributes to the link tag.

//the blog post (a sample value is used here for illustration)
$content = '<a href="photo.jpg"><img src="thumb.jpg" /></a>';

//XML doesn't like "&nbsp;", so replace it with the proper XML equivalent
//see: http://techtrouts.com/webkit-entity-nbsp-not-defined-convert-html-entities-to-xml/
$content = str_replace("&nbsp;", "&#160;", $content);

//load the text into a DOM
//add a root tag in case there isn't one
$xml = simplexml_load_string('<div>' . $content . '</div>');

//if there's a problem loading the XML, skip the highslide stuff
if ($xml !== false){
	//get all links that contain an image
	$links = $xml->xpath('//a[img]');
	
	//add the highslide stuff to each link
	foreach ($links as $link){
		$link->addAttribute('class', 'highslide');
		$link->addAttribute('onclick', 'return hs.expand(this)');
	}

	//marshal XML to a string
	$content = $xml->asXML();
	
	//remove the XML declaration at the top
	$content = preg_replace('~^<\\?xml.*?\\?>~', '', $content);
	
	//trim whitespace
	$content = trim($content);
	
	//remove the root tag that we added
	$content = preg_replace('~(^<div>)|(</div>$)~', '', $content);
}

As you can see, the blog post text has to be awkwardly manipulated in order to be read into a DOM and written back out as a string. That's why I have a lot of comments here: when I have to revisit this code in 6 months, I won't be totally confused.

Caching the RSS file

One last thing to mention is that I cache the RSS file so that my website doesn't have to contact Blogger every time someone loads this page. When the cached file gets to be more than an hour old, a fresh copy of the file is downloaded from Blogger.
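The caching logic can be sketched like this. It is not the code this page actually runs: the function name, the cache path, and the $download callback are all made up for illustration, and error handling is left out.

```php
<?php
/**
 * Returns the feed, downloading a fresh copy only when the cached file
 * is missing or older than $maxAgeSeconds.
 * $download is a hypothetical callback that returns the feed as a string.
 */
function loadFeedWithCache(string $cacheFile, int $maxAgeSeconds, callable $download): string
{
	//PHP caches file stats, so clear the cache before checking the file's age
	clearstatcache(true, $cacheFile);
	$isFresh = is_file($cacheFile)
		&& (time() - filemtime($cacheFile)) < $maxAgeSeconds;

	if ($isFresh){
		return file_get_contents($cacheFile);
	}

	//cache is missing or stale: download a fresh copy and save it
	$feed = $download();
	file_put_contents($cacheFile, $feed, LOCK_EX);
	return $feed;
}
```

With a one-hour limit, a call might look like loadFeedWithCache('/tmp/blog.rss', 3600, $download), where $download is a function that fetches the feed from Blogger.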
