Sunday, September 20, 2020

Resume Advice for InfoSec Job Seekers

 

  • Keep it short and simple
    • Even for people who have worked in the industry for a decade or more, a 2-page resume (one piece of paper, front-and-back) should be attainable
  • Review and update your resume for every job you apply for
    • People who have the most success with job applications make sure all of the documents they provide (resume, cover letter, references and job application responses) are crafted and targeted to the position for which they are applying
  • Job experiences should be relevant
    • This does not mean that only InfoSec experience counts; this means each experience should be angled towards showing how you gained or applied InfoSec-related skills within that experience
    • As you prep your resume for each job you’re applying for, think about previous experiences in terms of how they relate to the prospective role
  • Each experience should answer the question: What was your individual contribution?
    • Saying you participated in a project or were on a team is fine, but do not forget to highlight what your specific contributions were to those projects or teams
    • If you collaborated with a couple other people on a single task, focus on the elements you provided
  • Ditch the generic “career goals” section
    • The operative word here is generic. If you are passionate about something and can make this sound like a personal mission that is important to you and is uniquely you, then leave it in
    • If all you have to say is you “want to get a job in InfoSec, hack all the things, and protect stuff,” then at best it’s not doing anything to help you stand out, and at worst, it shows you’re just like everyone else who wants to work in InfoSec
    • Instead - use that real estate on your resume to say something that helps you stand out. Talk about something uniquely you - a group you founded, a tool/script/program you created, a policy/strategy/marketing campaign you came up with, or a personal philosophy that explains your approach to InfoSec
  • Streamline your technical skills, and focus on what’s important
    • It’s 2020, people, and it’s fair to assume everyone has at least passing knowledge of how to use Microsoft Office products. Unless the job explicitly mentions you need to be proficient in Word and Excel, there is no reason to list them
    • Unless the job says Windows or Mac experience is required, take them off
    • Caveat: If a job expects that you are proficient in a specific operating system and can perform command line scripting (as an example), something in your resume should highlight your experience in that area
  • Avoid inflating promotions or title changes into multiple positions
    • Sure, if these were distinctly different roles within the same organization, list them and touch on those unique experiences
    • If you were basically doing the same job the entire time, and had some title changes along the way, either pick the most current one and attach all of your experience to that one, or list all of the titles but consolidate them to a single collective experience
  • Avoid doxxing yourself through your resume
    • If you are posting your resume on sites like LinkedIn or Glassdoor so it can be viewed publicly, you probably don’t want to include your home address and personal cell phone number
    • Keep multiple versions of your resume if you have to - one that you use for public display that says “Contact info available on request” or that just displays an email address; and one that has the rest of the details that you would include with job applications or provide to recruiters
  • Do not put your date of birth or Social Security Number on your resume
    • <Sigh> Just. Don’t.
  • Scale back the details of your education based on your work experience
    • If you are applying for your first job or (especially) an internship, the company may specifically want to know your GPA; otherwise, it’s not necessary
    • If you’ve been working in the industry for a number of years, then GPA and graduation year are probably both unnecessary
    • In all cases though, do include the school you attended - large or small. This can become a conversation piece in unexpected ways, and that’s a good thing
  • Keep references separate from your resume
    • This helps to conserve space on your resume and lets you decide when to provide them (and gives you more control over whom you provide at the time references may be contacted)
  • Highlight volunteer work, regardless of whether it is related to InfoSec
    • This shows involvement outside of work and your desire to give back to the community
  • Link to any InfoSec work you do on your own time
    • A great way to do this is to start a blog, which can serve as a supplement to your resume
    • If you maintain an active GitHub of personal work, include a link to that as well
  • Check out my tips for creating cover letters (I still need to pare it down a bit)
  • Other suggestions - less critical than the aforementioned ones
    • Create a designer, one-page resume that focuses more on keywords and an eye-catching layout, in contrast to a more traditional resume
      • This is a good one to carry with you and hand out at career fairs or conferences
    • Include a section for groups you participate in outside of work and/or hobbies
      • This could also contain memberships to professional organizations
    • Job applications should not just be a copy and paste of your resume
      • While it’s certainly more work, you don’t want to miss the opportunity to share additional information about your work experiences
      • One strategy for this could be to emphasize keywords in the job application, and emphasize work experience in the resume
    • Include your social media accounts if they are suitable for professional purposes
      • LinkedIn and Twitter are the typical ones used for this purpose
    • Mention some of the learning opportunities or other activities you have pursued on your own time
      • Local or virtual conferences attended, online classes or other self-taught efforts are all good to mention


Thursday, July 30, 2020

Clearing the Queue


After my brief stint with the bank and watching the financial and housing markets crumble, I returned to the university. While the bank had the bad fortune of continuing to tank after I left (I should point out, I had nothing to do with this), I had the good fortune of being offered a lead position on the university's web presence team. One benefit of the position was that I had some latitude as to what my specific role should be.

After meeting the other folks on the team and listening to their challenges, three specific problems emerged as priority items:
    1. They wanted to get a handle on the intake of new requests and improve the management of the work in general
    2. They were looking for enhancements to their business continuity and disaster recovery processes
    3. They needed to improve the stability of the website's backend services running ColdFusion (yes, in 2007, people still ran ColdFusion)
All of these were clearly important issues to tackle, and I'm pleased to say we did address all of them, but for the purpose of this discussion I'm going to focus on the 3rd issue, as it was the one that altered the way I approached future problems.

ColdFusion provides a number of services to websites, including scripting, database functionality, server clustering, and task queues. It could handle much of this functionality very well; however, as the web applications grew in size and complexity, ColdFusion would not always scale properly. For us, this presented itself as the services freezing and webpages no longer displaying updates. For the most part, the pages would still render, but new content would get hung up between the submission process and the back-end update process. As a result, we would receive calls that content was not displaying properly, and then we would "fix" the problem by restarting the ColdFusion services.

One attempt at proactively "solving" this problem prior to my arrival was to create scheduled tasks in the OS to restart the services automatically every hour, with the two servers in the cluster set to restart a half hour apart. This quelled the problem well enough for a while, but not long after I arrived, some additional problems started to arise from it. A residual effect of these restarts was that the task queue would collect events that may or may not release properly when the services came back up. So over time, this queue would fill up with events that would then overrun the memory pool, which in turn caused everything to hang. To resolve this issue, an administrator had to go in and manually clear the queue log - essentially deleting the hung events.

Initially, this was happening once a week or so, but as time went on, it would happen more and more frequently. By the point it was happening about once a day, we knew we needed a better solution than waiting for a phone call to know the queue needed to be cleared out.

The initial solution we arrived at was to see if there was a way to programmatically monitor the queue to watch for the number creeping up. When everything was functioning properly, there should be anywhere from a few events to maybe 100 events if you had a bunch of people submitting changes at the same time. Everything would function just fine though until there were 1000 or more events. So we built an ASP.Net app to just render a simple graphic that displayed green, yellow, red, and purple based on the number of events. Any time that we saw it go red, we knew we needed to go in to clear the queue. So the first step was monitoring the queue on screen.

After running this for a bit, and confirming that it was working correctly, we added a function that would send an email alert as soon as the queue hit red. This way we could be alerted after hours without having to manually keep an eye on things. This at least gave us some freedom from having to check the screen several times a day to see how it was doing. Since it was an ASP.Net app, we could at least check it from a cell phone easily. The second step to this process was proactively sending alerts.

Once we got to this point, I asked the question - is there a way to clear the queue without having to log into the console to do it manually? After some research, we discovered that we could indeed call a function from ASP.Net to clear the queue. We added this function to the app we created and put the logic behind a button on screen, such that when we got an alert we could just pull up the app on whatever computer we were near, including our cell phones, and click the button to clear the queue. This was fantastic on multiple levels, as it was far less work for us and could be done easily wherever we were. It also meant that instead of one of the administrators always having to hop on their computer to resolve the issue, we were able to delegate this to anyone. We wrote very simple instructions that amounted to "If the screen is red, click the button." The third step to this process was to simplify the process programmatically.

The final step in our process came rather naturally. We had a button we could push whenever we needed to fix the problem, and we were getting alerts whenever the problem occurred. All we had to do at this point was join the two processes together - whenever the app went to send an alert, it would also call the function to clear the queue. In theory then, by the time we got the alert and checked the app, the problem should have already gone away. Once we implemented this step, this specific problem was fully mitigated and virtually eliminated. This last step to the process was automation.

Seeing the benefits derived from this approach to problem solving reinforced it as an approach that could be applied to many future problems (some of which I will cover in later posts). To summarize this approach to troubleshooting and problem solving:
  1. Set up monitoring - figure out a way to detect the problem before it occurs by identifying leading metrics that are indicators of the coming problem
  2. Set up alerting - once you've determined how to monitor the leading indicators, further enhance the process (and response times) by alerting folks that actions need to be taken
  3. Simplify the process - break down the steps to take in such a way that all of the logic can happen behind the scenes, and document the process so others can follow it without having to be experts
  4. Automate the process - once you're confident that the process is working consistently and you've defined it in a way that doesn't require expert intervention, hook the alerting and resolution logic together so that it automatically resolves itself
This process has proven successful time and again in the years since. As I've worked with other teams along the way, we have built systems that applied these same principles and gained tremendous efficiency in the process.
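
For anyone who wants to see the shape of this pattern in code, here is a minimal sketch in Perl (our actual implementation was an ASP.Net app, and the threshold, polling interval, and helper routines below are placeholders for illustration, not the original code):

    #!/usr/bin/perl
    # Sketch of the monitor -> alert -> automate loop described above.
    # All names and numbers here are illustrative placeholders.
    use strict;
    use warnings;

    my $red_threshold = 1000;   # queue depth where things started to hang
    my $poll_seconds  = 300;    # how often to check

    # Placeholder: replace with however your platform reports queue depth
    # (for us, it was the ColdFusion event queue).
    sub get_queue_depth { return 0; }

    # Placeholder: replace with the call that actually clears the hung events.
    sub clear_queue { warn "clearing hung events from the queue\n"; }

    # Placeholder: in practice this was an email to the team.
    sub send_alert {
        my ($depth) = @_;
        warn "ALERT: queue depth is $depth (threshold $red_threshold)\n";
    }

    while (1) {                              # step 1: monitoring
        my $depth = get_queue_depth();
        if ($depth >= $red_threshold) {
            send_alert($depth);              # step 2: alerting
            clear_queue();                   # step 4: automated resolution
        }
        sleep $poll_seconds;
    }

The point is less about the code and more about the progression: each piece was useful on its own before the last step tied alerting and resolution together.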

Wednesday, July 22, 2020

Creating My OWASIS - Part 3 (Putting the pieces together and wrap-up)


In this third and final post, I will walk through the various components that went into making OWASIS work. In case you missed them, here are the links to Part One and Part Two.

This part of the process was the actual fun part - writing and assembling the scripts into a semi-cohesive package that could be run on a repeated basis to refresh the inventory information. I figured out in Part Two that I would rely on a combination of Bash and Perl scripting to make this all work. There were still a few minor obstacles to overcome.

For one, I wanted all of the data output in a consistent manner, and some of the commands to do this would not render properly if they were just called through a remote, interactive session. So I wrote a script that could be uploaded and then run on any of the servers, which I called Remote.sh. This script formed the core of the inventory system; it could be run on any server version and would return the data formatted in a consistent manner. The challenge was how to get this script onto all of the servers.

I decided to tackle the Telnet-only systems first. Since Telnet does not support file transfers, I decided to FTP (ugh, yep, not sFTP since that wasn't available) the Remote.sh script to the server first, then call the script from the Telnet session. This worked nicely and returned the information to the terminal.

The next step was to write a script that would automatically log in over Telnet and then execute the Remote.sh script that had been previously sent to the user's home directory via FTP - I called this script AutoTelnet.pl. This script incorporated the previously mentioned "expect.pm" module to handle sending over the username and password (see security disclaimer in Part Two).

The last piece was essentially a loader script that would call the other two. All this last script for the Telnet systems did was upload Remote.sh and then execute it by running the AutoTelnet.pl script - I named this script Ftp_Remote.sh (for obvious reasons).

For the SSH servers, I still used Remote.sh to run the commands remotely on all of the servers so that I could capture the data in a consistent manner, but since SSH supports file transfers as well, the process of moving the file and then executing it was very streamlined - and it too leveraged the "expect.pm" module for automating the login process.  I called this script AutoSSH.pl.

These scripts collectively represented the real bones of the OWASIS system. I had to write some additional supporting scripts, though, to make this as fully automated as possible. This included scripts like nslook.sh, which I used to perform an nslookup on all valid hostname ranges (the bank named their servers sequentially, fyi). I used listing.pl to parse the output of nslook.sh and determine which systems supported SSH and which only supported Telnet. Another script called Parse2csv.pl was used to scrape the output files from the Remote.sh runs into a comma separated value file.
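
To give a flavor of what Parse2csv.pl was doing (the field labels below are invented for illustration - the real Remote.sh output had a different layout), Perl's regex handling makes this kind of scrape-and-reformat work very compact:

    #!/usr/bin/perl
    # Illustrative sketch in the spirit of Parse2csv.pl: scrape labeled
    # text output into CSV rows. The input format shown is hypothetical.
    use strict;
    use warnings;

    my %rec;
    print "hostname,os_version,was_version\n";

    while (my $line = <>) {
        chomp $line;
        if    ($line =~ /^Hostname:\s*(\S+)/)    { $rec{host} = $1; }
        elsif ($line =~ /^OS Version:\s*(.+)$/)  { $rec{os}   = $1; }
        elsif ($line =~ /^WAS Version:\s*(.+)$/) { $rec{was}  = $1; }
        elsif ($line =~ /^\s*$/ && $rec{host}) {
            # A blank line ends one server's block; emit a row and reset.
            print join(',', map { $_ // '' } @rec{qw(host os was)}), "\n";
            %rec = ();
        }
    }

Point something like that at a combined log and you have a CSV that Excel or Access can consume directly.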

As I mentioned in Part Two - and looking back in hindsight - there were many security issues present with the way all of this worked. For one, while I played around with making the collection of the username and password interactive for the person running the scripts to avoid hardcoding these values into the files, I still had to use a configuration file (called ftpers.txt) to store these values for running the Ftp_Remote.sh script. If you mistyped the password in either the config file or the interactive prompts, it would lock the account. This required a call to the security team (plus a mea culpa) to get the account unlocked. And this worked fine for the most part - except for the systems that were Telnet only - because I would not be able to access FTP until a successful Telnet authentication took place. So I wrote another script that I called AutoTelPasswd.pl, which was my get-out-of-jail/unlock-account script. Let that run on all of the Telnet servers and I was back in business!

For anyone that has not lost total interest in all of this at this point (anyone? Bueller? Bueller?), here are the original instructions I wrote up on how to run OWASIS:

Open-source WAS Inventory Package Instructions

Note: When doing the following, be SURE to use your correct password - failure to do so WILL lock your account on all of the machines it attempts to log into
  1. Replace "<username>" and "<password>" in ftpers.txt
  2. Run "./Ftp_Remote.sh"
    1. After it has automatically ftp'd the Remote.sh script to all of the servers in tn_ips.txt, it will prompt you for a username and password to use to telnet into all of the machines and run the Remote.sh script
  3. Run "perl AutoSSH.pl ssh_ips.txt"
    1. This can be run concurrently with ./Ftp_Remote.sh, as all of the processing is done remotely, so it will not slow down your local machine.
  4. When Ftp_Remote.sh completes, view the log file in an editor that allows you to do block select mode (textpad or ultraedit32), and block select only the first character in every line of the file, and then delete that block. (This way both log files have the same format)
  5. Run "cat SSH_connections-<datestamp>.log TN_connections-<datestamp>.log > Master_Inventory.txt"
    1. This will populate a single file with all of the output from Telnet and SSH servers
  6. Run "perl Parse2csv.pl Master_Inventory.txt > <output file.csv"
    1. I usually make an output file with a datestamp similar to the tn and ssh_connections files
  7. Open your <output file>.csv file in Excel
    1. There will be three distinct partitions/ranges to the file
    2. Add text labels above the first row in each partition as follows:
      1. Partition 1: Hostname, Brand, Model#, OS Version, Processor Cores, IP Address
      2. Partition 2: Hostname, WAS Versions
      3. Partition 3: Hostname, WAS Home, Server Name, Server Status
    3. Select all of the cells in the first partition/range, go to Data, then Filter > Advanced Filter; check unique records only, and click OK
      1. Repeat for each of the three partitions
    4. Copy and paste each partition (text labels included) into its own sheet of a new Excel Workbook
    5. Rename the three sheets in the new workbook as follows:
      1. Sheet 1: Machine Info
      2. Sheet 2: WAS Versions
      3. Sheet 3: Server Info
    6. Proceed with any formatting, sorting, etc. of your choice
  8. If you so choose, now that you have a well formatted Excel "Database" file, you can import this into Access to run queries against - each sheet is equivalent to a table in a database - hostname is the primary key.




Friday, July 17, 2020

Creating My OWASIS - Part 2 (Solving the problem)


In this second part of "Creating My OWASIS", we will get into the approach I took to solve the problem of how to create an inventory of systems for the bank where I worked. If you missed Part One, which provided background and an overview of my role with the bank, you can find it here.

The assignment, you may recall, was to create an inventory of the existing WebSphere Application Servers deployed at the bank. This included identifying all of the development, test, and production systems and their associated versions of WebSphere Application Server, Linux, and certificate information. At a high level, one approach could have been just manually logging into each individual server, running commands to find the requested information, and noting it in a spreadsheet. Taking this approach, I probably could have completed the assignment in roughly a week or two. And for those two weeks, my days would amount to arriving at work, logging into my workstation, opening up PuTTY, and then walking through the list of hundreds of systems one at a time, picking up where I left off the day prior.

I don't know about you, but I do not have the energy, attention span, or desire to spend this many hours of my life wasted in tedium. Fortunately, all of the servers running WebSphere Application Server were running a variety of Linux flavors - so perhaps I could write a script to make this process more efficient (and interesting)?

I spent some time brainstorming what was possible and how it would ideally work. My goal was to make it fully automated (or as close to it as possible) - whereby I could feed in a list of servers and it would automatically login, run some commands, and return back the desired information. I knew I could easily accomplish some of this using Bash scripts, particularly for systems that were running ssh, but I found out early on that there were a shameful number of servers still only running <gasp> telnet </gasp> of all things. Well, I wasn't going to let this lunacy slow me down - there had to be a way around this.

I shared my ideas with a friend of mine, and they gave me the suggestion to take a look at Perl, and specifically to look at using the "expect" module. This proved to be exactly the secret sauce I was looking for.

MAJOR CAVEAT - what you are about to read absolutely pre-dates my time in a security role, and while judgement is certainly allowed (encouraged, in fact), this no longer reflects recommendations that I would give today.

There were several ways that Perl was an attractive option for what I was trying to accomplish. The major strength comes from just the sheer number of modules (what other languages call libraries) available that provide a vast array of functionality from which to draw capabilities. Another major strength of Perl is its ability to parse data from either fixed-format or completely unstructured data. This strength comes from how tightly regular expressions (RegEx) are integrated into the language. This makes it tremendously easier to take output and format it into something useable and then import it into another application (Excel, for example). The last strength is of course the one I mentioned earlier - specifically, the expect.pm module - which can be used for automating processes.

The expect.pm module performs the unique function of building what are essentially cases that fire off depending on what is output to the screen. While my plan was to use this specifically to interact with login prompts and prompts to supply passwords (again - not secure), it could really automate anything that involves "if X is returned, then do Y". Functionally, if you are familiar with IFTTT, then you already have a fundamental grasp on how this works.
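
To make that concrete, here is roughly what the pattern looks like with the Expect module (the host, prompts, and credentials below are placeholders, and hardcoding a password like this is precisely the insecure part being disclaimed):

    #!/usr/bin/perl
    # Sketch of automating a telnet login with Expect: each pattern is the
    # "if X is returned" and each sub is the "then do Y". Placeholders only.
    use strict;
    use warnings;
    use Expect;

    my ($host, $user, $pass) = ('server01.example.com', 'someuser', 'not-a-good-idea');

    my $exp = Expect->spawn('telnet', $host)
        or die "Cannot spawn telnet: $!\n";

    $exp->expect(30,
        [ qr/login:/i    => sub { my $self = shift; $self->send("$user\n"); exp_continue; } ],
        [ qr/password:/i => sub { my $self = shift; $self->send("$pass\n"); exp_continue; } ],
        [ qr/[\$#>]\s*$/ => sub { my $self = shift; $self->send("uname -a; exit\n"); } ],
    );

    $exp->soft_close();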

By combining the power of Bash, Perl, and Expect.pm, I had all the tools needed to create a package of scripts that could automate from start to finish the process of building out an Open-source WebSphere Application Server Inventory System (aka "OWASIS").

Coming soon will be part 3 of this unnecessarily lengthy topic, where I will walk through each of the components that went into the package of scripts.

Apologies to Peter Jackson for stealing his creative process.


Thursday, June 18, 2020

Creating my OWASIS - Part 1 (Setting the stage)


During the second half of 2007, I did a brief stint at a bank - which was exactly the wrong time to be starting a career at a bank. As you may recall, this was right when the mortgage crisis was beginning to come about, and it lasted into 2009. While I was only there for 6 months to the day, I got to watch the stock value plummet to ~10% of what it was when I started. Within the next couple of years after I left, the bank was purchased by another bank and MANY people lost tons of money on the deal.

How is any of this relevant to my next story?

I was originally brought in (along with one other person) to be part of a new group within the Middleware Management and Support team, specifically to assist with doing research and development into new uses for WebSphere Application Servers. My role was to assist with the development of systems that would improve the throughput of EFTs (Electronic Funds Transfers) between the mainframe and the downstream ATMs and Web endpoints. However, just as I was about to join the team, this project went on hold due to the IT Architecture team deciding to begin development using WebLogic instead. With this, my role immediately changed from building and testing WebSphere in new applications to just maintaining the existing WebSphere systems like the rest of the team. The problem was, there was already a team of people that supported the existing WebSphere environment, and between the shift in technology focus away from that team and the sudden downturn of the stock market, they were reluctant to show the new guys the ropes.

Humans have an innate sense of impending doom which fires up long before the rational part of the brain realizes what is happening. This then engages the fight-or-flight response in order to preserve oneself. The way this manifested within my team was relegating me and my other new coworker to the most basic of tasks, and barely lifting a finger to get us pointed in the right direction. They were afraid training us would train themselves out of their jobs; in hindsight, they were probably correct.

I was literally given one real, personally-assigned project to work on independently, and this was to create an inventory of the existing WebSphere Application Servers. Mind you, the WebSphere servers numbered in the hundreds, and people had long since lost track of what they all were, which ones were still in use, and basically what was even still powered on. Being the resourceful type, and also BORED OUT OF MY MIND, I decided to think of ways that I might be able to automate the process and - most importantly - save myself time if I ever had to do this again.

I've always said that the best programmers I've known are the laziest. This may sound counterintuitive on the surface, but it is precisely because they are lazy that they seek ways to avoid having to perform repetitive tasks. Logging into hundreds of servers, running a command to see if WebSphere is installed, and then documenting the version number is the pinnacle of repetitive tasks.

Fortunately, this assignment (which I assume was just intended to be busy-work keeping me occupied anyway) came with no instructions for how they wanted it completed, nor a deadline for when it was to be completed. And so, I took this as an opportunity to learn some new skills and create the best damn inventory process possible.

Continued in Part 2...

Monday, June 8, 2020

Kickstarting Knoppix







Those of you who were interested in running Linux in the early to mid-2000s ("20-aughts") may remember a clever distribution called Knoppix that allowed you to run Linux on any computer using a bootable CD or DVD. This distribution was used to form the basis of several others: Helix - used for computer forensics; Kanotix - which added a feature to perform hard drive installs of Knoppix; and the still-popular Kali (originally BackTrack) - which is widely adopted by penetration testers.

The beauty of Knoppix, as mentioned earlier, was that it could run fairly reliably on almost any equipment of that era. This made it an attractive option in cases where it was difficult to predict what equipment you may need to run it on. For this reason, I wondered if this would be a beneficial option for Disaster Recovery processes. The specific problem that I was looking to solve was: "how do we start rebuilding our many RedHat servers from bare metal in as efficient a manner as possible?"

RedHat developed a utility called KickStart that was essentially a local, network-based RedHat distribution mirror. In it, you could store a customized set of RPMs along with other packages and programs that you wanted to deploy onto your new servers. For general administration processes, we set up a KickStart server for building new servers from scratch fairly easily. And because it resided on the local network, we never had to worry about download speeds; the bandwidth limitation was just the NICs on the servers and the switches within the data center. This worked great on the equipment we had in place; however, it did not work as well in a disaster recovery setting, as RedHat had a tendency to be fussy when you tried to restore an image onto a different model of hardware. Based on the contract we had with our vendor, we were guaranteed the same or better grade of equipment, but not identical equipment (which would have been much more expensive). This created complications with restoring servers from backups in general, let alone restoring our KickStart server. We needed to figure out a way to bring up the KickStart server without having to fix a bunch of driver issues at the time of recovery in order to expediently build the other servers.

Enter Knoppix.

Even though Knoppix was based on Debian, which is significantly different from RedHat, the underlying technology (the kernel, runtime environments, etc.) was very much the same. Since KickStart runs off of a web server, all I needed to do was install and enable Apache on Knoppix and configure a site that referenced the location where I would store the RPMs and the KickStart config file. I then made sure Apache came up automatically upon boot by updating init.d.
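
For context, the KickStart side of this is just a web-accessible tree of RPMs plus a ks.cfg file along these lines (the URL, partitioning, and package list here are invented for illustration and are not our original config):

    # ks.cfg - illustrative placeholder values only
    install
    url --url http://192.168.10.5/redhat/
    lang en_US
    keyboard us
    network --bootproto dhcp
    rootpw --iscrypted $1$changeme$
    timezone America/New_York
    bootloader --location=mbr
    clearpart --all --initlabel
    autopart
    reboot

    %packages
    @ Base
    httpd

The Knoppix disc's only job was to get Apache serving that tree on a predictable address so the new machines could point their installers at it.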

Knoppix provides a process by which you can set persistence on changes made to the Knoppix environment and then save it as a new iso image. After doing this and confirming with a new test disk that it would boot and run correctly, I ran into my next hurdle - storage.

A standard high-density CD has roughly 650MB of useful storage capacity, and the Knoppix base image took up nearly all of it, leaving barely any room for dropping in the set of RPMs we needed. Fortunately, some of the newer releases of Knoppix at the time (version 4.0.2, specifically) were capable of running from a DVD as well. Unfortunately, Knoppix was taking advantage of this added space with a bunch of additional programs that we certainly did not need. So I got to work, removing and uninstalling as many of the programs as I could while maintaining the minimum necessary to bring up the KickStart server. I had to make a few concessions - for example, KDE was WAY too bloated for our needs, but we still needed an X Window System environment, so I replaced KDE with Blackbox, keeping just the KDM display manager for the backend. Doing so freed up just enough space to fit everything.

One last configuration I made before packing it up and burning it to disk was to set the network to use a static IP. This way, once the KickStart server boots up, we can stand up the new servers on the same network as the KickStart server, initiate the KickStart install process using network install or PXE install, and off it goes!

Having this option available in the event of a disaster gave us assurances that we could quickly and easily bring up a KickStart server which could then be used to perform bare metal installs of all of the servers within our environment.


Saturday, May 23, 2020

We're Out of Serial



One of the more frustrating and stressful events I will share in this blog (to-date anyway) involved the Sun Solaris 10 SPARC systems that were running the Oracle database server for our ERP system.

By this point in my career, I had been hired into my first full-time computer-related job as a Systems Administrator in the central IT department. I was given the assignment of being the primary SysAdmin for the ERP implementation project, and I had approximately 4 weeks plus a training course's worth of experience running Solaris 10 servers prior to this. Needless to say, I was in a bit over my head.

The event that occurred stemmed from the Sun cluster that was used to provide high availability for the database - there were two servers in the cluster, and the database could fail over between them with a floating IP address that would move with the active instance. Mind you, this was an entirely new architecture within IT, and I had one other coworker who had some semblance of Solaris experience, though none beyond version 9. Cool.

Some details of this event are a bit foggy after 14 years, but I will share what I can.

The first indication there was a problem was the phone ringing on my desk concurrent with instant messages blowing up on my computer. They were developers and a DBA from the ERP project team, and it seemed they were no longer able to connect to the servers. Not good. I asked them if they were using the right IP address (yes, for whatever reason, these folks never took to using DNS names, and they had a habit of pointing to the static IP rather than the float IP). They assured me they tried all of the IPs (which I bought, because normally they would just say they were using the correct one and it would magically start working once they had me on the phone).

I tried connecting myself - to no avail - and determined that indeed there was an issue; it was particularly telling that ALL of the static IPs and the float IP were timing out across both servers. This was very not good.

There is something you need to know about Sun servers - when you first bring them up, you can either connect to a local serial port for the initial boot and setup, or you can use DHCP to automatically assign an IP and connect remotely. We used DHCP, so it was easy enough to set up from a remote session. Unlike any other server in our data center though, these did not have a VGA connector into which you could plug a monitor and keyboard when remote access fails. Typically though, with Sun servers, once you got them set up that first time, you never needed to touch the serial port again; they were pinnacles of reliability.

Until they weren't.

It was about this point that I could feel the color leaving my skin, as I had never had to touch the serial port on these before. Did we even have the right connector for them? Possibly. Did I know how to use it? NOPE.

Sun would ship one fancy metal rs232 db9 to rj45 connector per server. And since our ERP ran on 10 or so Sun servers, we fortunately had plenty of the db9 to rj45 connectors around. Easy enough: find a standard cat-5 patch cable with rj45 terminations on both ends, plug in one of the db9 to rj45 connectors on each end, connect one to the server and the other to your laptop, and everything's peachy.... Not so fast. Plug all this in together, and there is nothing on screen - how could this be? Maybe one of the connectors was faulty? I tried several others, no dice.

A coworker had a usb to serial connector (don't ask why, no clue), so I asked to borrow that. Plugged it in, fired up Hyperterm and I got mostly gibberish. Sure there were characters on the screen, and I could type, but nothing was happening. It did not seem to recognize any of the commands I sent. It was progress, but only barely. This is not good. My suspicion was that the pin-out for the Sun server and the expected transmission pins in the USB to serial connector were not the same, and Hyperterm was not sophisticated enough to renegotiate that serial connection to make it work.

(Quick aside - looking back on this with significantly more experience, there were probably changes I could have made in Hyperterm to get that USB connector to work properly. Oh well, didn’t happen.)

Why wouldn't Sun just ship a db9 to an ANYTHING YOU COULD FRIGGIN' PLUG INTO A LAPTOP connector???

After a bit of research (aka, Googling frantically for any lead I could find), I found a pin-out table for the Sun db9 connector (Like this one). That's when it dawned on me - it's the same problem you have with network cables! You cannot connect just a standard patch cable from one server to another, you have to use a cross-over cable so your transmit and receive pins line up properly.

For the next 45 minutes, I found myself cracking open the case of one of the db9 connectors, pulling out all of the pins, and then matching opposing pins to the associated cross-over equivalent. I plug one end of this Frankenstein cable to the server, the other end into the serial port of my laptop, fire up Hyperterm, and up appears a scrolling list of errors (recognizable as some of the previous gibberish) and this time, I could send commands and receive responses - amidst the scrolling errors. I send a simple "init 6" command to each of the servers, and when they come back up the issue seems to have resolved. Still no clue as to what caused the cluster to lose its mind, but at least I now had a cable to use should the same fiasco happen again.
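
For anyone facing the same thing: the idea behind the crossover is just that transmit on one end has to land on receive on the other. For a bare-bones DB9-to-DB9 null modem, that looks like the following (the Sun-specific adapter pins followed the pin-out table mentioned above, which I won't try to reproduce from memory):

    Laptop DB9 (DTE)         Server side
    Pin 3  TxD  ---------->  Pin 2  RxD
    Pin 2  RxD  <----------  Pin 3  TxD
    Pin 5  GND  -----------  Pin 5  GND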

Thursday, May 21, 2020

Time for a Reboot


It's time for a recharge and a reboot.

Similar to how computers that are left running at a constant clip get bogged down with memory clogs and orphaned processes and threads, so too does the human body. It can get stuck in a rut sometimes just going through the motions as if on autopilot, all the while wearing down with years of pent-up stress and frustration. Every once in a while a recharge and a reboot becomes necessary.

This is mine. I am considering this both the end of my past and the beginning of my future.

I have spent too many years mired in the culture, constraints, and complacence of my relatively stable career. I'm generally happy with what I do. I definitely enjoy the team of folks with whom I work. But a lot has happened over the past couple months that has brought new perspective to who I am and what I am capable of doing.

There is a fork in the path ahead of me. One direction provides relative certainty as to where it ends. Sure, there are a few bumps along the way, especially in the portion closest to me that is easiest to see, but the horizon has a fairly reliable destination that is not terribly unpleasant. However, it lacks adventure and excitement, and that is partially because it is also relatively low risk. It's like walking on a paved bike & hike trail.

The other path is rugged and unsteady, and has boulders in place of the bumps. This path is also much more difficult to see where it ends up, or how many twists and turns it will have along the way. There may be unseen perils lurking just out of eye-shot. There may be places where it's hard to even tell where the path picks up on the other side. And there may be drop-offs along the way, where one bad step could lead to catastrophe. This path also has mystery about it. Perhaps there are even uncharted portions where I would become the first one paving the way. Will it lead where I want to go? Perhaps. More than likely, it will lead wherever I am willing to take it - and that may require having a larger appetite for risk and almost certainly will require more work.

"Why," you might ask, "would you be willing to sacrifice the more certain path?" The answer is simply: because it is tiring and tedious. I've been running along this path a long time. I pass the same people day-in and day-out, and they are all great people, and they cheer me on along the way. My body is so accustomed to walking along this path, it's hardly even a workout any more. I am craving something new. I am craving challenges that I have never had to face. I am craving meeting new people and building new connections. And I am craving new sights and sounds to tease my senses.

Could choosing this new path be a giant mistake?
Perhaps.
Is there a tremendous amount of risk in doing this?
Almost certainly.
And am I making a giant wager in this choice?
Absolutely.

But I am betting on myself. I know that I am capable of doing more. I know that I have been the victim of circumstance along the way. And I know that I deserve to get as much enjoyment and fulfillment out of my career as many others have achieved for themselves.

I worry that my outlook is naive and short-sighted; that even though I think I know what risks lie ahead, they are actually far greater than I can imagine. I also worry that I take for granted the stability that I have been afforded by traveling along the same old path. And I worry that I will quickly regret this decision. I may. And I may very well find that, once again, the grass really is no greener anywhere else. Maybe so. Maybe not. I certainly won't know if I don't give it a shot. I just hope that if anything does go south, that I can find enough of a safety net to still catch myself. That is my greatest fear in all of this.

If I had more connections that I could lean on to steer me in the right direction and help catch me if I were to fall, this would not be as difficult a choice. But you see - it's for exactly that reason, that I must take this leap of faith. I will never build those connections if I stay where I'm at. I've been kept locked up for far too long. Well, not any more.

Hello World!

As I set out on this new path, grabbing my trail shoes and walking stick for the treacherous path that lies ahead, I look into the great unknown with nervous anxiety, anticipation, wonder, uncertainty, thirst, fear, and excitement.

I just hope I'm making the right decision.

Sunday, May 17, 2020

Cracking My Own Cuckoo's Egg


While not nearly as fantastical and incredible as the true story Cliff Stoll shared in "The Cuckoo's Egg," I too had the opportunity to experience a similar thrill in catching a "cyber criminal" who was up to no good.

This story took place while I was a college senior working for the College of Business Technology Support Center. By this point I had risen to the rank of Lead Student LAN Administrator, and was in charge of leading the other tech support student workers as well as building and managing some of the servers used within the college. In our sizable storage room, we had some space set aside where I and some of my coworkers could build development servers for independent learning.

One evening, I received a message from one of my coworkers who was out of town for the weekend, saying he thought someone had broken into the server he had built. When he built the server, it was running Windows 2000 Server; however, when he went to RDP into it, he found that it was now running Windows Server 2003 - and he swore he did not install 2003. This seemed odd, especially considering that if someone broke in remotely, they would have to figure out a way to load the media, perform the install, and ensure they could reconnect to the system afterwards. I found a few articles that suggested this could be done as an in-place upgrade, so the story seemed plausible enough.

Unfortunately, whoever had taken possession of my coworker's server had also removed his access to it. Without the ability to log into the system, it was going to be more difficult to figure out what was going on. Additionally, if this was someone local (which was my hope), I did not want to tip them off that I was on to them, figuring that if they found out, they would wipe the system of anything that could trace back to them. How could I passively recon information and evidence without tipping them off? Sure, I could just disconnect the server from the network, comb through the logs, and hope there was evidence that pointed back to them; however, I really wanted to catch them in the act - but how?

Several things came to mind.

First, I ran Nmap against the system to see what ports were open in order to get a sense of what they might be doing with the server. The most interesting thing to note from this was that port 194 was open and appeared to be running some form of an IRC server. My coworker had not been using this for IRC - this was new. Aside from that, the rest of the ports were typical Windows system ports, including RDP (3389) and SMB (139 & 445).
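
The scan itself was nothing fancy - something along these lines (the address is made up, and I no longer recall the exact flags I used back then):

    nmap -sT -p 1-65535 -O 10.10.10.10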

This at least gave me some direction on what to look for, if I could sniff the traffic. While the server was not connected to a switch from which I could perform traffic mirroring, since I had physical access to the system, I could insert a device between it and the wall from which I could sniff the traffic. Network hubs were more commonly used back then, and we fortunately still had a few unused ones lying around. The beauty of a hub, as opposed to a switch, is that a hub blindly sends copies of all the traffic to all of the interfaces (as opposed to a switch, which only sends the traffic to the appropriate destination).

So here was the next plan - disconnect the cable from the wall, connect it to the hub, then connect another cable from the hub to the wall. Next, I would grab a laptop, install WinPCAP and Wireshark (this was before I became proficient with Linux), and connect that to the same hub.

Once I had everything set up, I needed to be quick in switching out the cables - again, to avoid tipping them off. And voilà! Just like that, I could see all the traffic flowing to and from the server. Working in my favor was that IRC sent all of the messages in clear text, so I could see all of the conversations streaming across the wire. This pretty much gave me everything I needed to start tracking down this pest. I ran capture after capture of the traffic to see if I could tell the source IP address of the person in control of the server.

I noticed pretty quickly that one IP, which happened to originate from an on-campus residence hall network (per the DNS name), was consistently connecting to the IRC port. While not conclusive, it was a lead. The real break, though, came after a few days of monitoring other activity originating from that IP. I happened to catch a connected session established from that same IP to the RDP port. A ha! This was my person. After a quick call to Network Services to trace the IP, they were able to provide me with the switch it was associated with and the room number into which it was patched. So I had an IP address and a residence hall room number; all I needed was a name.

I find that it pays to have friends in many areas and departments on campus, including Residence Services. I reached out to my friend there, who happened to be a fellow student employee that helped manage the housing database, to see if they might be willing to look up some information for me. They had no issue with that - while not publicly available, this was directory information that did not require consent or approval to hand out. I told him the room number, and he gave me the name - and wouldn't you know it! The occupant of that room was a student who had previously worked in the College of Business Technology Support Center. Gotcha!

I gathered up all of the information I had collected, wrote up my findings, preserved the evidence from Nmap and Wireshark, and turned everything over to my boss. He promptly contacted the University's recently formed information security office so they could take action. Following a conversation with the student about how, as innocent as this was in the greater scheme of things, it was still considered breaking the law - and brought along the threat of expulsion - the student wisely confessed, apologized for their mistakes, and agreed not to misuse any more university computers or networks. Getting to see their embarrassment at getting caught was satisfying enough for me.




Wednesday, May 13, 2020

Frozen Bytes


One trick I learned early in my computer career served me well on numerous occasions: freezing hard drives.

While this certainly did not work every time, for many issues with the older platter-style hard drives, simply putting it in the freezer for anywhere from 3 to 24 hours could temporarily resurrect a drive. Common ailments this could treat included the dreaded "click of death", the "unrecognized media" errors and spontaneous system faults (which typically occur leading up to the other two).

Often what I found worked best was freezing the drive, then running the Steve Gibson program SpinRite on it to detect problem sectors, copy the data to functioning sectors, and then flag the problem sectors so they were not written to again. At this point you could either put the drive back into service (not recommended) or, if it continued to exhibit problems, copy the data to a new drive and dispose of the old one. I found using either Windows safe mode or a bootable BartPE disk would work best for copying the data. Since the drive was mounted using just a lightweight operating system, fewer resources were consumed, and it was least impactful on the drive itself since the operating system would not run processes to index all of the files.

One drive I dealt with was particularly problematic.

The drive came out of the computer belonging to the chair of the Management Information Systems department in the College of Business. And of course, there were no backups taken of any of the data stored locally (which unfortunately was most of the data they needed). Usually you could tell pretty quickly if the freezing technique was going to work on a drive - you freeze it and either it works or it doesn't. In the case of this drive though, it would start to work and then it would stop. At first I suspected there was a sector that was so corrupt that whenever it was read, the whole thing would crash. But after the second or third time freezing it and then starting a copy, I noticed that the copy would error out in different places. One time it might error out after 11%. The next time it might error out at 27%. Something didn't add up. The really unfortunate part was that SpinRite wanted nothing to do with the drive at all. It would start up and immediately hang no matter what.

Since I could get some of the data to copy up to different points, I wondered if whatever the freezing technique was overcoming would arise again as the drive warmed back up. To test this theory, I devised a way to keep the drive frozen while retrieving data from it. Remembering how ice and salt interact from my physics classes, I thought, maybe if I could create an environment colder than ice, I could lengthen the time until the drive inevitably crashed again.

I gathered some supplies: an old 5.25" floppy disk storage container, a static resistant bag, an additional bag (to make doubly sure not to get the drive wet), some ice, and some table salt. I put the drive in the static bag, the static bag in the other bag, put those in the floppy disk storage container partially filled with ice, put more ice on top of the bags, connected the hard drive to the computer I used to copy files, and then salted the ice (lowering the freezing point, thus maintaining a lower temperature in the ice bath). By doing this, I was able to retrieve far more data from the drive - almost 85% of the data was recoverable. While I was not able to recover everything, there is no way I could have preserved as much of their data as I did, had I not discovered this solution.
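
(For the curious, the physics at work is plain freezing-point depression: dissolved salt lowers the freezing point of the melt water by roughly ΔT = i · Kf · m, with Kf ≈ 1.86 °C·kg/mol for water, and a heavily salted ice bath can sit around -21 °C instead of 0 °C - which is what bought the extra copy time.)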





Tuesday, May 5, 2020

The Case of the Printer Shortage


To kick-off the first of these posts focused on my past transgressions with information technology, let's take a look back at my first ever "hack".

The year was 1998, and I was a freshman in college, barely a month into my first student employment job as the "lab monitor" for the Honors College computer lab.  This lab, mind you, was the size of a large walk-in closet and was composed of 8 total computers: 6 cow-box era Gateway PCs running Windows 98 (1st edition, not Second Edition), and 2 Apple Macintosh computers running either Mac OS 7 or Mac OS 8.  There were 3 printers in the lab: two attached to 4 of the Windows computers by way of two parallel port switch boxes, and the third attached to one of the Macs.

The problem with this setup was that only 3 of the computers could print at any given time, and 3 of the computers could not print at all.  Those parallel port switches were analog disasters - you had to remember to switch to either the A or B side, corresponding to which computer you were sitting at, and you had to check with the person at the computer next to you as to whether they were about to use the printer.  Then if the other person was using it, you had to wait until they were done to even send your print job.

Enter Microsoft Windows File and Printer Sharing.

Mind you, I'm not speaking as a security advisor, more as a novice computer user trying to solve a problem. It did not make sense to me to have this lab setup in a way where only certain computers could be used to print. After some research, I discovered that Microsoft File and Printer Sharing could be used to advertise an attached printer from one computer to all of the other computers in the local network - ideally placed into the same workgroup (a precursor to Active Directory that persists on home computers today). This seemed like a no-brainer solution to the problem!

Before messing up any of the computers in the lab, I tested how to enable this on my own computer in my dorm, set it up, and had my roommate see whether he could print to my printer - and voilà! Out spits his print job.  I then promptly disabled the setting to prevent others from connecting to my computer and doing the same.

The next night, after my shift was over, I went into the lab and created a new workgroup to which I added all of the Windows computers. Next, I went to the two "A" side Windows computers and enabled File and Printer sharing on each of those, giving the printer a new name that corresponded to the computer providing the sharing function and "A" to designate the "A" switch.  Next I went to all of the other computers and added both printers being shared out to all of them so they were available to select. I then added the other printer to each of the computers providing the sharing function, so they too had two available printers.

Lastly, I went to the Mac computers and enabled TCP/IP printing on the Mac that had the printer attached then added it as a printer on the other one.

By doing this, I was able to alleviate many of the stresses that occurred for students - especially for those who were not able to sit at one of the printer-attached computers.  This resulted in less time students spent waiting for a printer to become available, less data lost from having to move files back and forth between computers (to get to one that could print), and less time spent waiting for printer paper to be refilled (since they could just print to the other printer instead).



Thursday, April 30, 2020

Old Blog; New Content




Returning after an extended hiatus, the Top O' The Morning blog is back, but will no longer focus on current events in the news.

Beginning next month, May 2020, the Top O' The Morning blog will start featuring some of the unique technology-related hacks, fixes, solutions, and creations that I have worked on over the past couple decades.

As a sneak preview, this content will be covering topics such as: cryogenically freezing hard drives to resuscitate them, bootable RedHat Kickstart servers, automating inventory with Perl, using Google Sheets to scrape websites, and monitoring beer fermentation with an Arduino.

Buckle your seat-belts, folks, as I take you on a ride down memory lane!