Saturday, May 23, 2020

We're Out of Serial



One of the more frustrating and stressful events I will share in this blog (to-date anyway) involved the Sun Solaris 10 SPARC systems that were running the Oracle database server for our ERP system.

By this point in my career, I had been hired into my first full-time computer-related job as a Systems Administrator in the central IT department. I was given the assignment of being the primary SysAdmin for the ERP implementation project, and I had approximately four weeks of experience running Solaris 10 servers, plus a training course, prior to this. Needless to say, I was in a bit over my head.

The event that occurred stemmed from the Sun cluster used to provide high availability for the database: two servers in the cluster, with the database able to fail over between them using a floating IP address that moved with the active instance. Mind you, this was an entirely new architecture within IT, and I had one other coworker who had some semblance of Solaris experience, though none beyond version 9. Cool.

Some details of this event are a bit foggy after 14 years, but I will share what I can.

The first indication there was a problem was the phone ringing on my desk concurrent with instant messages blowing up on my computer. The callers were developers and a DBA from the ERP project team, and it seemed they were no longer able to connect to the servers. Not good. I asked them if they were using the right IP address (yes, for whatever reason, these folks never took to using DNS names, and they had a habit of pointing to the static IP rather than the float IP). They assured me they had tried all of the IPs (which I bought, because normally they would just say they were using the correct one and it would magically start working once they had me on the phone).

I tried connecting myself - to no avail - and determined that indeed there was an issue; it was particularly telling that ALL of the static IPs and the float IP were timing out across both servers. This was very not good.

There is something you need to know about Sun servers - when you first bring them up, you can either connect to a local serial port for the initial boot and setup, or you can use DHCP to automatically assign an IP and connect remotely. We used DHCP, so it was easy enough to set up from a remote session. Unlike any other server in our data center though, these did not have a VGA connector into which you could plug a monitor and keyboard when remote access fails. Typically though, with Sun servers, once you got them set up that first time, you never needed to touch the serial port again; they were pinnacles of reliability.

Until they weren't.

It was about this point that I could feel the color leaving my skin, as I had never had to touch the serial port on these before. Did we even have the right connector for them? Possibly. Did I know how to use it? NOPE.

Sun would ship one fancy metal RS-232 DB9-to-RJ45 connector per server. And since our ERP ran on 10 or so Sun servers, we fortunately had plenty of the DB9-to-RJ45 connectors around. Easy enough: find a standard Cat 5 patch cable with RJ45 terminations on both ends, plug a DB9-to-RJ45 connector onto each end, connect one to the server and the other to your laptop, and everything's peachy.... Not so fast. Plug all this in together, and there is nothing on screen - how could this be? Maybe one of the connectors was faulty? I tried several others; no dice.

A coworker had a USB-to-serial adapter (don't ask why, no clue), so I asked to borrow that. Plugged it in, fired up Hyperterm, and I got mostly gibberish. Sure, there were characters on the screen, and I could type, but nothing was happening. It did not seem to recognize any of the commands I sent. It was progress, but only barely. This was not good. My suspicion was that the pin-out of the Sun server and the transmission pins the USB-to-serial adapter expected were not the same, and Hyperterm was not sophisticated enough to renegotiate the serial connection to make it work.

(Quick aside - looking back on this with significantly more experience, there were probably changes I could have made in Hyperterm to get that USB adapter to work properly. Oh well, didn't happen.)

Why wouldn't Sun just ship a db9 to an ANYTHING YOU COULD FRIGGIN' PLUG INTO A LAPTOP connector???

After a bit of research (aka Googling frantically for any lead I could find), I found a pin-out table for the Sun DB9 connector (like this one). That's when it dawned on me - it's the same problem you have with network cables! You cannot connect just a standard patch cable from one server to another; you have to use a cross-over cable so your transmit and receive pins line up properly.
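The cross-over idea can be sketched as a pin mapping. To be clear, this is generic DB9 null-modem wiring as an illustration of the concept - the actual Sun adapter ran through RJ45 and had its own pin-out, which is exactly the table I had to hunt down:

```python
# Typical DB9 null-modem (cross-over) wiring - an illustration of the idea,
# NOT the exact Sun DB9-to-RJ45 adapter pin-out from the story.
NULL_MODEM = {
    2: 3,  # RX <-> TX : received data crosses to transmitted data
    3: 2,  # TX <-> RX
    4: 6,  # DTR <-> DSR : data terminal ready crosses to data set ready
    6: 4,  # DSR <-> DTR
    7: 8,  # RTS <-> CTS : request to send crosses to clear to send
    8: 7,  # CTS <-> RTS
    5: 5,  # signal ground runs straight through
}

def is_symmetric(mapping):
    """A valid cross-over maps pin a -> b and pin b -> a on the mating side."""
    return all(mapping.get(b) == a for a, b in mapping.items())
```

The symmetry check is the whole trick: if transmit on one end does not land on receive at the other end (and vice versa), you get exactly the silent screen I was staring at.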

For the next 45 minutes, I found myself cracking open the case of one of the DB9 connectors, pulling out all of the pins, and then matching opposing pins to their cross-over equivalents. I plugged one end of this Frankenstein cable into the server and the other end into the serial port of my laptop, fired up Hyperterm, and up appeared a scrolling list of errors (recognizable as some of the previous gibberish) - and this time, I could send commands and receive responses amidst the scrolling errors. I sent a simple "init 6" command to each of the servers, and when they came back up the issue seemed to have resolved. Still no clue as to what caused the cluster to lose its mind, but at least I now had a cable to use should the same fiasco happen again.

Thursday, May 21, 2020

Time for a Reboot


It's time for a recharge and a reboot.

Similar to how computers that are left running at a constant clip get bogged down with memory clogs and orphaned processes and threads, so too does the human body. It can get stuck in a rut sometimes just going through the motions as if on autopilot, all the while wearing down with years of pent-up stress and frustration. Every once in a while a recharge and a reboot becomes necessary.

This is mine. I am considering this both the end of my past and the beginning of my future.

I have spent too many years mired in the culture, constraints, and complacence of my relatively stable career. I'm generally happy with what I do. I definitely enjoy the team of folks with whom I work. But a lot has happened over the past couple of months that has brought new perspective to who I am and what I am capable of doing.

There is a fork in the path ahead of me. One direction provides relative certainty as to where it ends. Sure, there are a few bumps along the way, especially in the closest portion that is easiest to see, but the horizon has a fairly reliable destination that is not terribly unpleasant. However, it lacks adventure and excitement, and that is partially because it is also relatively low risk. It's like walking on a paved bike & hike trail.

The other path is rugged and unsteady, and has boulders in place of the bumps. This path is also much more difficult to see where it ends up, or how many twists and turns it will have along the way. There may be unseen perils lurking just out of eye-shot. There may be places where it's hard to even tell where the path picks up on the other side. And there may be drop-offs along the way, where one bad step could lead to catastrophe. This path also has mystery about it. Perhaps there are even uncharted portions where I would become the first one paving the way. Will it lead where I want to go? Perhaps. More than likely, it will lead wherever I am willing to take it - and that may require having a larger appetite for risk and almost certainly will require more work.

"Why," you might ask, "would you be willing to sacrifice the more certain path?" The answer is simply: because it is tiring and tedious. I've been running along this path a long time. I pass the same people day-in and day-out, and they are all great people, and they cheer me on along the way. My body is so accustomed to walking along this path, it's hardly even a workout any more. I am craving something new. I am craving challenges that I have never had to face. I am craving meeting new people and building new connections. And I am craving new sights and sounds to tease my senses.

Could choosing this new path be a giant mistake?
Perhaps.
Is there a tremendous amount of risk in doing this?
Almost certainly.
And am I making a giant wager in this choice?
Absolutely.

But I am betting on myself. I know that I am capable of doing more. I know that I have been the victim of circumstance along the way. And I know that I deserve to get as much enjoyment and fulfillment out of my career as many others have achieved for themselves.

I worry that my outlook is naive and short-sighted; that even though I think I know what risks lie ahead, they are actually far greater than I can imagine. I also worry that I take for granted the stability that I have been afforded by traveling along the same old path. And I worry that I will quickly regret this decision. I may. And I may very well find that, once again, the grass really is no greener anywhere else. Maybe so. Maybe not. I certainly won't know if I don't give it a shot. I just hope that if anything does go south, that I can find enough of a safety net to still catch myself. That is my greatest fear in all of this.

If I had more connections that I could lean on to steer me in the right direction and help catch me if I were to fall, this would not be as difficult a choice. But you see - it's for exactly that reason, that I must take this leap of faith. I will never build those connections if I stay where I'm at. I've been kept locked up for far too long. Well, not any more.

Hello World!

As I set out on this new path, grabbing my trail shoes and walking stick for the treacherous path that lies ahead, I look into the great unknown with nervous anxiety, anticipation, wonder, uncertainty, thirst, fear, and excitement.

I just hope I'm making the right decision.

Sunday, May 17, 2020

Cracking My Own Cuckoo's Egg


While not nearly as fantastical and incredible as the true story Cliff Stoll shared in "The Cuckoo's Egg," I too had the opportunity to experience a similar thrill in catching a "cyber criminal" who was up to no good.

This story took place while I was a college senior working for the College of Business Technology Support Center. By this point I had risen to the rank of Lead Student LAN Administrator, and was in charge of leading the other tech support student workers as well as building and managing some of the servers used within the college. In our sizable storage room, we had some space set aside where I and some of my coworkers could build development servers for independent learning.

One evening, I received a message from one of my coworkers, who was out of town for the weekend, saying that he thought someone had broken into the server he had built. When he built the server, it was running Windows 2000 Server; however, when he went to RDP into it, he found that it was now running Windows Server 2003 - and he swore he did not install 2003. This seemed odd, especially considering that if someone broke in remotely, they would have to figure out a way to load the media, perform the install, and ensure they could reconnect to the system afterwards. I found a few articles that suggested this could be done as an in-place upgrade, so the story seemed plausible enough.

Unfortunately, whoever had taken possession of my coworker's server had also removed his access to it. Without the ability to log in to the system, it was going to be more difficult to figure out what was going on. Additionally, if this was someone local (which was my hope), I did not want to tip them off that I was on to them, figuring that if they found out, they would wipe the system of anything that could trace back to them. How could I passively recon information and evidence without tipping them off? Sure, I could just disconnect the server from the network, comb through the logs, and hope there was evidence that pointed back to them; however, I really wanted to catch them in the act - but how?

Several things came to mind.

First, I ran Nmap against the system to see what ports were open, in order to get a sense of what they might be doing with the server. The most interesting thing to note from this was that port 194 was open and appeared to be running some form of IRC server. My coworker had not been using this for IRC - this was new. Aside from that, the rest of the ports were typical Windows system ports, including RDP (3389) and SMB (139 & 445).
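At its core, the check Nmap was doing for me is just "does a TCP connection to this port succeed?" A minimal sketch of that idea (a bare connect scan against a hypothetical host, nothing like Nmap's full service detection):

```python
import socket

def check_port(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds (port open)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return s.connect_ex((host, port)) == 0

# The ports of interest from the scan: IRC on 194, plus the usual
# Windows suspects - RDP (3389) and SMB (139 & 445).
SUSPECT_PORTS = {194: "irc", 3389: "ms-wbt-server", 139: "netbios-ssn", 445: "microsoft-ds"}
```

Nmap, of course, adds banner grabbing and service fingerprinting on top of this, which is how the mystery service on 194 showed up as "some form of IRC server" in the first place.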

This at least gave me some direction on what to look for, if I could sniff the traffic. While it was not connected to a switch from which I could perform port mirroring, since I had physical access to the system, I could insert a device between it and the wall jack from which I could sniff the traffic. Network hubs were more commonly used back then, and we fortunately still had a few unused ones lying around. The beauty of a hub, as opposed to a switch, is that a hub blindly sends copies of all the traffic to all of its interfaces (whereas a switch only sends traffic to the appropriate destination).

So here was the next plan: disconnect the cable from the wall, connect it to the hub, then connect another cable from the hub to the wall. Next, I would grab a laptop, install WinPcap and Wireshark (this was before I became proficient with Linux), and connect that to the same hub.
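In Wireshark terms, narrowing a hub capture down to the interesting conversations comes down to a couple of filters. Something along these lines (using 194 and 3389 as the suspect ports from the earlier scan):

```
# Capture filter (BPF syntax): only record traffic on the suspect ports
tcp port 194 or tcp port 3389

# Display filter: show just the IRC and RDP conversations in a full capture
tcp.port == 194 || tcp.port == 3389
```

On a shared hub segment, that is all it takes - every frame is already arriving at your interface; the filters just cut out the noise.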

Once I had everything set up, I needed to be quick in switching out the cables - again, to avoid tipping them off. And voilà! Just like that, I could see all the traffic flowing to and from the server. Working in my favor was that IRC sent all of its messages in clear text, so I could see all of the conversations streaming across the wire. This pretty much gave me everything I needed to start tracking down this pest. I ran capture after capture of the traffic to see if I could tell the source IP address of the person in control of the server.
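Because IRC is a line-oriented plaintext protocol, picking the conversations out of a capture is mostly string parsing. A minimal sketch of pulling chat messages out of raw IRC lines (the sample line is hypothetical, not from the actual capture):

```python
def parse_privmsg(line):
    """Parse an IRC chat line of the form ':nick!user@host PRIVMSG #chan :text'.

    Returns (nick, target, text), or None if the line is not a PRIVMSG.
    """
    if not line.startswith(":") or " PRIVMSG " not in line:
        return None
    prefix, rest = line[1:].split(" PRIVMSG ", 1)
    nick = prefix.split("!", 1)[0]          # sender's nick precedes the '!'
    target, _, text = rest.partition(" :")  # channel/user, then message body
    return nick, target, text

# A hypothetical line, as it might appear in a cleartext capture:
sample = ":intruder!u@res-hall.example PRIVMSG #chan :server is back up"
```

Every message crossing the wire in the clear like this is why a plain hub-and-Wireshark setup was enough to read the whole conversation.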

I noticed pretty quickly that one IP, which happened to originate from an on-campus residence hall network (per the DNS name), was consistently connecting to the IRC port. While not conclusive, it was a lead. The real break, though, came after a few days of monitoring other activity originating from that IP. I happened to catch an established session from that same IP to the RDP port. A-ha! This was my person. After a quick call to Network Services to trace the IP, they were able to provide me with the switch it was associated with and the room number into which it was patched. So I had an IP address and a residence hall room number; all I needed was a name.

I find that it pays to have friends in many areas and departments on campus, including Residence Services. I reached out to my friend there, a fellow student employee who helped manage the housing database, to see if he might be willing to look up some information for me. He had no issue with that - while not publicly available, this was directory information that did not require consent or approval to hand out. I told him the room number, and he gave me the name - and wouldn't you know it! The occupant of that room was a student who had previously worked in the College of Business Technology Support Center. Gotcha!

I gathered up all of the information I had collected, wrote up my findings, preserved the evidence from Nmap and Wireshark, and turned everything over to my boss. He promptly contacted the University's recently formed information security office so they could take action. Following a conversation with the student about how, innocent as this was in the greater scheme of things, it was still considered breaking the law - and brought with it the threat of expulsion - the student wisely confessed, apologized for their mistakes, and agreed not to misuse any more university computers or networks. Getting to see their embarrassment at getting caught was satisfying enough for me.




Wednesday, May 13, 2020

Frozen Bytes


One trick I learned early in my computer career served me well on numerous occasions: freezing hard drives.

While this certainly did not work every time, for many issues with the older platter-style hard drives, simply putting the drive in the freezer for anywhere from 3 to 24 hours could temporarily resurrect it. Common ailments this could treat included the dreaded "click of death", "unrecognized media" errors, and spontaneous system faults (which typically occur leading up to the other two).

Often what I found worked best was freezing the drive, then running Steve Gibson's program, SpinRite, on it to detect problem sectors, copy the data to functioning sectors, and then flag the problem sectors so they were not written to again. At this point you could either put the drive back into service (not recommended) or, if it continued to exhibit problems, copy the data to a new drive and dispose of the old one. I found that either Windows safe mode or a bootable BartPE disc worked best for copying the data. Since the drive was mounted under just a lightweight operating system, fewer resources were consumed, and it was the least impactful on the drive itself, since the operating system would not run processes to index all of the files.

One drive I dealt with was particularly problematic.

The drive came out of the computer belonging to the chair of the Management Information Systems department in the College of Business. And of course, there were no backups of any of the data stored locally (which unfortunately was most of the data they needed). Usually you could tell pretty quickly if the freezing technique was going to work on a drive - you freeze it, and it either works or it doesn't. In the case of this drive, though, it would start to work and then it would stop. At first I suspected there was a sector so corrupt that whenever it was read, the whole thing would crash. But after the second or third time freezing it and then starting a copy, I noticed that the copy would error out in different places. One time it might error out after 11%. The next time it might error out at 27%. Something didn't add up. The really unfortunate part was that SpinRite wanted nothing to do with the drive at all. It would start up and immediately hang, no matter what.

Since I could get some of the data to copy to different points, I wondered if whatever the freezing technique was overcoming would arise again as the drive warmed back up. To test this theory, I devised a way to keep the drive frozen while retrieving data from it. Remembering how ice and salt interact from my physics classes, I thought that maybe, if I could create an environment colder than ice, I could lengthen the time until the drive inevitably crashed again.

I gathered some supplies: an old 5.25" floppy disk storage container, a static-resistant bag, an additional bag (to make doubly sure not to get the drive wet), some ice, and some table salt. I put the drive in the static bag, the static bag in the other bag, put those in the floppy disk storage container partially filled with ice, put more ice on top of the bags, connected the hard drive to the computer I used to copy files, and then salted the ice (lowering the freezing point, thus maintaining a lower temperature in the ice bath). By doing this, I was able to retrieve far more data from the drive - almost 85% of it was recoverable. While I was not able to recover everything, there is no way I could have preserved as much of their data as I did had I not discovered this solution.
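For the curious, the chemistry behind the salted ice bath can be sketched with the ideal freezing-point depression formula, ΔT = i · Kf · m. This is a back-of-the-envelope illustration (the salt quantity is hypothetical, and real brines deviate from ideal behavior - a saturated salt-ice mix bottoms out around -21 °C):

```python
def freezing_point_depression(molality, kf=1.86, i=2):
    """Ideal freezing-point depression dT = i * Kf * m, in degrees C.

    kf: cryoscopic constant of water, ~1.86 C*kg/mol
    i:  van 't Hoff factor; ~2 for NaCl (it dissociates into Na+ and Cl-)
    """
    return i * kf * molality

# e.g. a hypothetical 100 g of table salt (molar mass ~58.44 g/mol)
# dissolved in 1 kg of ice-melt water:
m = 100 / 58.44                       # ~1.71 mol per kg of water
dT = freezing_point_depression(m)     # roughly 6-7 C below zero, ideally
```

Even a few degrees below freezing buys meaningful extra copy time, which matches what I saw: the colder bath kept the drive limping along well past the point where plain ice gave out.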





Tuesday, May 5, 2020

The Case of the Printer Shortage


To kick-off the first of these posts focused on my past transgressions with information technology, let's take a look back at my first ever "hack".

The year was 1998, and I was a freshman in college, barely a month into my first student employment job as the "lab monitor" for the Honors College computer lab.  This lab, mind you, was the size of a large walk-in closet and was composed of 8 total computers: 6 cow-box-era Gateway PCs running Windows 98 (1st edition, not Second Edition), and 2 Apple Macintosh computers running either Mac OS 7 or Mac OS 8.  There were 3 printers in the lab: 2 attached to 4 of the Windows computers by way of two parallel-port switch boxes, and the other attached to one of the Macs.

The problem with this setup was that only 3 of the computers could print at any given time, and 3 of the computers could not print at all.  Those parallel-port switches were analog disasters - you had to remember to switch to either the A or B side, corresponding to the computer you were sitting at, and you had to check with the person at the computer next to you as to whether they were about to use the printer.  Then, if the other person was using it, you had to wait until they were done to even send your print job.

Enter Microsoft Windows File and Printer Sharing.

Mind you, I'm not speaking here as a security advisor, more as a novice computer user trying to solve a problem. It did not make sense to me to have this lab set up in a way where only certain computers could be used to print. After some research, I discovered that Microsoft File and Printer Sharing could be used to advertise an attached printer from one computer to all of the other computers on the local network - ideally placed into the same workgroup (a precursor to Active Directory that persists on home computers today). This seemed like a no-brainer solution to the problem!

Before messing up any of the computers in the lab, I tested how to enable this on my own computer in my dorm, set it up, and had my roommate see whether he could print to my printer - and voilà! Out spits his print job.  I then promptly disabled the setting to prevent others from connecting to my computer and doing the same.

The next night, after my shift was over, I went into the lab and created a new workgroup to which I added all of the Windows computers. Next, I went to the two "A"-side Windows computers and enabled File and Printer Sharing on each of them, giving each printer a new name that corresponded to the computer sharing it, plus an "A" to designate the "A" switch.  Next, I went to all of the other computers and added both shared printers so they were available to select. I then added the other printer to each of the computers providing the sharing function, so they too had two available printers.

Lastly, I went to the Mac computers and enabled TCP/IP printing on the Mac that had the printer attached, then added it as a printer on the other one.

By doing this, I was able to alleviate many of the stresses students experienced - especially those who were not able to sit at one of the printer-attached computers.  This resulted in less time spent waiting for a printer to become available, less data lost from having to move files back and forth between computers (to get to one that could print), and less time spent waiting for printer paper to be refilled (since they could just print to the other printer instead).