Secure, Managed File Transfer

How to Tell if You Need Secure, Managed File Transfer

If your organization is like most, you probably move more than a few files from place to place. There are probably several—if not hundreds of—vendors or business partners that you transfer information to, perhaps in an XML file or even a comma‐separated values (CSV) file that was exported from a spreadsheet or database. Some of these file transfers may happen on a regular, scheduled basis; I'm betting that more than a few of your organization's file transfers happen on‐demand, in a more ad‐hoc fashion. You probably move files within your organization as well, either between departments or perhaps even between divisions.

One reason you may have started reading this book is the word "secure" in the title, although the word "managed" may have pulled you in, too. More and more organizations are grappling with that "security" word these days, either because they're simply tightening their own internal controls over their corporate information or because they're subject to industry requirements or legislative requirements that force them to secure and audit certain types of information. As more data flies back and forth across our private and public networks, we have good reason to become more concerned about who else might be reading that data—hence the focus on security.

If you've read the previous two paragraphs and thought, "Yes, this is my organization," then you've found the right book. I wrote this book specifically for organizations that need to move data from place to place, and need to do so in a secure, auditable, managed fashion, no matter what kind of data they're moving.

What Is "Secure, Managed" File Transfer?

This is a good time to get some terminology definitions out of the way. Many of the terms I'll be using in this and upcoming chapters are used in a variety of ways by a variety of different people, so I should spell out exactly what I mean by those terms so that you're not misled.

File transfer is a generic term that applies to the act of transmitting data over a computer network—either a private network or a public one like the Internet. Notice that I wrote transmitting data rather than transmitting files; a file is really just a container for data, and it's not the container we're usually concerned with—it's the data within. In any event, any transmission of data more or less ignores the container—data is just a stream of bytes.

Managed is one of those vague terms that can mean a lot of different things. In the context of managed file transfer, it usually refers to software solutions that are designed to facilitate file transfer. There's even a commonly‐used acronym: MFT. MFT solutions provide a "wrapper" around file transfer techniques and protocols, and that wrapper allows them to do things like schedule and automate file transfers, report on file transfer activity, measure file transfer performance and other metrics, and so on.

Secure file transfer is another overloaded term that people think they know the definition of. Secure file transfer is often a component of managed file transfer; the "secure" part usually refers to a bunch of specific capabilities, including:

  • Encryption. This is what most people think of when they see the term "secure file transfer," and it refers to the ability to encode data in such a way that only the sending and receiving parties can view it.
  • Auditing. This is an aspect of security that fewer people tend to think of right away, but it's an aspect that's becoming more and more important. It refers to the ability to track every activity associated with file transfer, such as who sent a file, when they sent it, who received it, when it was received, and so on.
  • Non‐repudiation. This refers to the ability of a file transfer system to ensure and prove that a file was received by the correct recipient.

There are other elements of security that we'll explore throughout this book, but this short list will do for now.
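To make the auditing and non‐repudiation ideas concrete, here is a minimal Python sketch of what a single transfer's audit record might capture. The field names and the use of a SHA‐256 digest are my own illustrative choices, not any particular product's format; the digest ties the record to the exact bytes that were sent, which is the raw material for a non‐repudiation claim.

```python
import hashlib
from datetime import datetime, timezone

def build_audit_record(sender, recipient, filename, payload):
    """Build one audit-trail entry for a single file transfer.

    The SHA-256 digest binds the record to the exact bytes sent,
    so neither party can later claim a different file was moved.
    Illustrative only; a real MFT product defines its own schema.
    """
    return {
        "sender": sender,
        "recipient": recipient,
        "filename": filename,
        "sha256": hashlib.sha256(payload).hexdigest(),
        "sent_at": datetime.now(timezone.utc).isoformat(),
    }

# Hypothetical transfer of a small orders file
record = build_audit_record("orders@dotcom", "vendor42", "orders.csv", b"PO-1001,5")
```

A collection of records like this, kept somewhere users can't rewrite, is the skeleton of the audit trail discussed above.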

Obviously, you need software in order to achieve secure, managed file transfer; software has to provide capabilities above and beyond those associated with simply streaming bytes across a network. In part, this book will be about the capabilities that you'll commonly find in MFT software solutions, so that you can do a better job of evaluating and selecting the right solution for your environment. If you don't really see yourself as a user of an MFT software solution, you may change your mind; Gartner has said that:

Numerous factors cause companies to re‐examine how they manage the movement of information from system to system, partner to partner, and person to person. FTP alone is not a viable option to give organizations the insight, security, performance, and, ultimately, the risk mitigation necessary to responsibly conduct business.

In other words, if your business is moving data, you probably are a potential user of an MFT software solution, whether you realize it or not. This chapter, in fact, will be a sort of test for you—if by the end, you're sure that none of these scenarios apply to you, then you're probably not going to be using an MFT solution, and you can stop reading this book (but wait until after Chapter 2, which is going to be really fun).

So now that we have a common vocabulary for secure, managed file transfer, where do you use it?

File Transfer Scenarios

Businesses typically have three scenarios in which they move data from place to place. It may seem a little redundant to talk about why you move data around, but this is actually an important place to start: Why you move data is where we'll find the specific business capabilities that you need. For example, if your business never transfers data on a recurring basis, then you may not need the same automation and scheduling capabilities that another business requires. So let's look at three basic scenarios as well as a couple of minor variations within each. Keep track of which scenarios apply to you.

Regularly Exchanging Files

The first scenario I always think of is moving data on a regular schedule between external business partners. The main reason my mind goes to this scenario first is that it has, quite honestly, been a major pain throughout much of my IT career. Years ago, I worked for a "dot com" that was a virtual retailer. In other words, we sold stuff that we didn't actually have. When we received customer orders, we transmitted those orders to the vendors that stocked those products, and the vendors drop-shipped the products directly to the customer. We had a very strong need, then, to regularly transmit order information to a huge variety of vendors, all of whom seemed to have different formats and protocols that I had to figure out.

Nowadays, my job would have been even more difficult. Because our customers invariably paid by credit card, we would have been subject to the Payment Card Industry Data Security Standard (PCI DSS), which outlines some pretty specific technical requirements for how we would have to handle customer data such as addresses, phone numbers, and so on. Our file transfers to vendors would have to be not only automated but also secured so that the customers' data wouldn't be revealed to anyone else.

I also had to worry about receiving files on a schedule, as our vendors transmitted invoicing information to us that way, although those didn't include any sensitive information. I not only had to have a place for them to send files—which would have been easy, because an FTP server would do the trick—but I had to watch for incoming files, grab them, and hand them off to a batch process that would import the invoice information into our accounting system. I needed more than just a simple FTP server, in other words: I needed a workflow-based automation system, or invoices wouldn't get paid and our vendors would soon stop dealing with us.
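That watch‐and‐dispatch pattern is easy to sketch. The following Python is a toy version of what I was doing by hand: poll an inbox directory, then hand each waiting invoice file to an import step. The directory layout and the `import_invoice` stub are assumptions for illustration; a real MFT workflow engine replaces this loop with scheduling, retries, and logging.

```python
import time
from pathlib import Path

def scan_inbox(inbox):
    """Return invoice files waiting in the inbox, oldest first."""
    files = [p for p in Path(inbox).glob("*.csv") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime)

def import_invoice(path):
    """Stand-in for the real accounting-system import step."""
    print(f"importing {path.name}")
    path.unlink()  # remove once processed so it isn't imported twice

def watch(inbox, interval=60, once=False):
    """Poll the inbox and dispatch anything that arrives."""
    while True:
        for path in scan_inbox(inbox):
            import_invoice(path)
        if once:
            break
        time.sleep(interval)
```

Run `watch("/data/inbox")` from a scheduler and you have a crude workflow; everything an MFT product adds (auditing, alerting on failure, approval steps) is what this loop is missing.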

But external partners aren't the only instances when files are transmitted on a regular, recurring basis. In another job, I had to help coordinate the movement of data between an AS/400 and a Unix‐based warehouse pick system. The AS/400 received sales information from hundreds of retail store locations and had to generate restocking orders for those stores. The restocking information had to be transmitted to the Unix warehouse system, which used a system of digital indicators to tell warehouse workers which products needed to go into which box for shipment to each store. Being a Unix system, it wanted mainly to deal with file transfer via common FTP, but we needed a very controlled, managed process to be sure the information got from one computer to the other. The Unix system would also transmit exceptions back to the AS/400—information on out‐of‐stock products, for example—so that the AS/400 would know that a particular store hadn't yet received a particular product and could be rescheduled for shipment when more product was received in the warehouse. The whole two‐way transfer of data via a fairly simplistic protocol like FTP was a real nightmare in the beginning, and it's one of the things that first set me looking at MFT software solutions.

So this is the first of three scenarios: Recurring, automated file transfer. You'll also see this referred to as system-to-system transfer because data isn't being transmitted between individual people but is instead being transmitted directly from computer to computer. In addition, some automated processes are running on each computer to either produce the files being sent or to process the files being received.

Occasional or Ad‐Hoc System‐to‐System Transfers

Another type of system‐to‐system transfer—again, not involving human beings—is the kind that doesn't occur on a regular basis but instead happens in a more ad‐hoc fashion, meaning that the parameters of the transfer are specified right when the transfer is made rather than in advance. I did this a lot, too, in former jobs. The dot com, for example, would sometimes need to add an extra set of orders for a specific vendor after our normal daily file transfer. That happened a lot during the holiday season when the order volumes were higher and the vendors had earlier cutoff times. The retailer I worked for would also need occasional ad‐hoc transfers to the Unix system, as that's how we would download new warehouse maps and other data that didn't change very often.

Although these ad‐hoc system‐to‐system transfers needed the same kind of security and management as the scheduled transfers, we found that we interacted with the file transfer system in an entirely different way. Rather than an administrator like myself setting up the transfer schedule in a back‐end management console, we found we needed a user interface (UI) that a less technically‐skilled user could operate. At the dot com, for example, ad‐hoc late afternoon transfers were usually set up by someone in our order‐management team because by that point in the day most of the IT staff was gone or were busy with other projects. At the retailer I worked for, new warehouse maps were created by the warehouse manager, and he didn't like having to wait on the IT staff to get to the file transfer—he wanted to be able to update that Unix system as soon as the new warehouse map was complete.

Another wrinkle was introduced when the warehouse manager started delegating the warehouse‐mapping duty to one of his subordinates. He wanted them to set up the file transfer to the Unix system. However, before the transfer actually happened, he wanted to review and approve the new maps. So we had to somehow create a UI that would accommodate that business workflow: accepting a file transfer order but holding it until approval was received from a specific individual.

So the second of three scenarios, ad-hoc system-to-system file transfer, includes requirements for different kinds of UIs. In some cases, you may also have additional auditing or even workflow requirements.

Ad‐Hoc, Person‐to‐Person Transfers

The last file transfer scenario involves people. Rather than transferring files from system to system, this scenario has people transferring files to each other. I did some consulting work for a hospital, and this was the most common type of file transfer there. Administrators would transmit patient records between departments within the same hospital and would transfer records between the hospital and external specialists like cardiologists, neurologists, and so on. Sometimes, they would do hospital‐to‐hospital transfers of records, when two doctors at different hospitals needed to consult on a particular patient.

You probably won't be surprised to learn that a lot of those transfers, at least when I started working with them, were done via email. In the next chapter, I'll spend some time explaining why email is a horrible idea for this kind of transfer. At the time, the hospital I worked with was just starting to implement their Health Insurance Portability and Accountability Act (HIPAA) requirements, and they had just figured out that email wasn't going to do the trick.

Ad‐hoc, person‐to‐person transfers are tricky to deal with from a business perspective. You have to provide a way to accomplish them; otherwise, users will just use email attachments. If you restrict the size of attachments to make that option unworkable, they'll start using consumer file‐sharing sites instead, which is even worse.

What we eventually figured out is that the hospital needed a system that could essentially do "system‐to‐person" transfers of files, using full auditing, encryption, and the other fun stuff that HIPAA required. That system needed a simple UI so that administrators could easily initiate transfers from their desktops, feeding the required file or files into the system and letting it take over and actually send the file to the destination. In the end, it felt to the end users like a person‐to‐person transfer: They went to a Web page on their intranet, specified the files they wanted to send and the recipient, and the system took over from that point and made sure it all happened.

Ad-hoc, person-to-person transfers are the third major file transfer scenario that businesses see. Which of these three scenarios are occurring in your business?

Business Needs for File Transfer

People like us just need data moved from place to place; we often don't think much about the process beyond that simple requirement. But from a business perspective, there are a lot more things to worry about as data starts flying around on the private network or, even worse, flying around on the public Internet. In the next several sections, I'll discuss the details of the most common business needs that accompany file transfers. For each, I'll outline scenarios where I've run into that need in the real world and help highlight considerations that are often overlooked, even by large companies that have experience in this area.

Meeting Internal Requirements

These days, everybody likes to focus on "compliance" as the driver for all things security-related. Whether you're talking about PCI DSS, HIPAA, the Gramm‐Leach‐Bliley (GLB) Act, the Sarbanes‐Oxley (SOX) Act, or federal government requirements, there seems to be no end of "compliance" requirements. And those examples are just in the US!

Companies have long had their own internal reasons for securing data. Simple internal compartmentalization is one reason: You might not, for example, want everyone in the company to have access to personnel records, salary charts, and so forth. In most companies, financial information and bookkeeping records are often considered confidential.

Concerns about corporate espionage also drive internal security concerns. The concept of data leakage—which is a really nice way to describe employees sharing data with those that they shouldn't—is always a concern, especially in companies whose business relies heavily on intellectual property that can't be adequately protected through mechanisms such as copyrights and patents.

Despite the growth in external requirements from legislative bodies and industry groups, internal requirements remain a strong concern for most corporate executives. File transfer is a key area in which data can be moved outside the organization, improperly disclosed, and used to potentially damage the business. Even accidental disclosure can be damaging, such as a case I had with one past employer. It actually wasn't the employer's fault; an employee improperly disclosed confidential personnel records for an employee that had recently been terminated. The terminated employee had applied for a new position with another company, and a person at that new company called a friend who still worked for ours, looking for information on their potential new hire. The friend dug around and provided that information—which resulted in the other company not hiring the person we'd fired. That person, of course, sued the heck out of us.

Our company could have defended itself against this by placing better controls on file transfer, but that's not really the first issue. After all, the information could have been divulged over the telephone or fax just as easily—except that both phone and fax were tightly managed, and using either would have created a trail of evidence leading back to the person who broke company policies about handling personnel records. If file transfers had been managed half as rigorously as phone and fax records, the offending employee could have been caught—and the company might have been able to deflect some of the legal damage onto the person who was actually responsible. As it was, we couldn't prove anything.

This is a big area where managed file transfer is intended to help: By not only placing some restrictions on who can send what, but most importantly by auditing what is sent, by whom, at what time, and to where. That audit trail can prove invaluable both for internal forensics as well as in legal defense, if it's ever needed.

Meeting External Requirements

I'm betting this is where you're expecting me to roll out the alphabet soup of industry and legislative requirements that we all refer to as "compliance," and I don't disappoint: HIPAA, SOX, GLB, FISMA, 21 CFR, PCI DSS, and more. The list is long and growing, but all of them have common general themes when it comes to securing and managing the transfer of files.

They usually all require something like this:

  • Data must be protected in‐transit to prevent unauthorized disclosure, which usually means using encryption of some kind
  • The transfer of data must be logged in a tamper‐proof or tamper‐evident log so that auditors can see who transferred what, when they did so, whom they sent it to, and so on
  • Only authorized individuals should be able to access and transfer data
  • In some cases, non‐repudiation is required, meaning that there must be proof that the data was received by a particular system or individual so that the recipient cannot legitimately deny having received the data
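The tamper‐evident logging requirement in particular is worth a sketch. One common technique is hash chaining, where each log entry's hash covers the previous entry's hash, so altering any earlier record breaks every hash after it. This Python version is a bare‐bones illustration, not a substitute for a product's signed audit log:

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def append_entry(log, entry):
    """Append an entry whose hash covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = json.dumps(entry, sort_keys=True)
    sealed = dict(entry, prev=prev,
                  hash=hashlib.sha256((prev + body).encode()).hexdigest())
    log.append(sealed)

def verify_chain(log):
    """Recompute every hash; any edit to an earlier entry fails the check."""
    prev = GENESIS
    for e in log:
        body = json.dumps(
            {k: v for k, v in e.items() if k not in ("prev", "hash")},
            sort_keys=True)
        expected = hashlib.sha256((prev + body).encode()).hexdigest()
        if e["prev"] != prev or e["hash"] != expected:
            return False
        prev = e["hash"]
    return True
```

An auditor (or the MFT product itself) can run `verify_chain` at any time; a real system would also sign or externally anchor the chain so the whole log can't simply be regenerated.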

But these are hardly the only external requirements that companies must deal with today. In many cases, external vendors or business partners may also set requirements for how their data must be handled. Go back to the dot com example I described earlier, and imagine that you work for one of the companies that we sent orders to. Those orders included customer information, and we, as your customer, had some requirements and expectations about how our customers' information would be handled: We didn't want that information transmitted to anyone else without our permission, and we wanted accountability for how that information was stored, accessed, used, and transmitted. Without assurances that our expectations would be met, we wouldn't do business with you.

In fact, customer expectations and assumptions are a major external requirement on data security. Let's say you stay in a hotel in a different country. In many cases, you'll be asked to show your passport when you check in, and the hotel may record your passport number, name, address, and other personal information. There's no worldwide rule on how that information must be protected, but you certainly expect that the hotel will keep your personal information under wraps, and you assume that they won't go sharing it with anyone inappropriately. You might use your credit card to pay for the in‐room Internet access, and you would expect that information to remain private as well, even though that information might be shared by the hotel, the Internet service provider (ISP), a billing company, and possibly other parties.

I definitely had those expectations and assumptions when I checked into a hotel in Europe, and used my credit card to pay for the in‐room Internet access. Several weeks later, when large, fraudulent charges started showing up on my account, I was a bit shocked. After some investigation, it turned out that the hotel collected my billing information for the Internet access and transmitted those files in batches by unsecured, unmanaged FTP to the Internet provider for archival purposes. Somewhere during that transfer process, the data was accessed and several credit card numbers "lifted" and used for fraudulent charges. Although neither the hotel nor the Internet provider broke any local laws, they certainly incurred the one penalty I could impose: I'll never do business with either of them again.

The fact is that many companies move all kinds of data from place to place, all the time. It's so commonplace that we barely even think of it; it's so easy in most situations that we definitely don't ever think twice. But simply moving data from place to place can be incredibly risky, and even if you're not violating internal company policies or legislative requirements, you may still leave yourself open to customers' wrath. That's one of the big reasons Gartner feels that MFT is such an important part of any business these days: We need to move files around, and we must do so in a secure, managed fashion.

High Availability

Let's take a break from security for a bit because it's hardly the only reason companies start looking at MFT solutions. High availability is another strong business driver for something better than simple FTP clients or email attachments; companies that rely on file transfers need their file transfer solution to be available all the time.

For example, let's go back to my dot com example. Originally, I had set up a bunch of FTP scripts on one of our servers to send drop‐ship orders to our vendors. One day, that server stopped working. Nobody noticed because the server wasn't used for much else—it also had a bunch of archived product graphics and stuff, but nothing anyone had to get to continuously. In fact, it was a couple of days before we noticed it was down—and only because one of our drop‐ship vendors called our sales manager to ask why we'd stopped sending orders every day. Oops. Obviously, we implemented monitoring solutions right away, but then I started thinking about it: Monitoring would tell me that there was a problem, but in our line of business, we couldn't afford for there to be a problem in the first place. What we needed was a set of two servers to handle file transfers so that if one broke, the other could take over. We eventually set up something like Figure 1.1.

Figure 1.1: Highly available file transfer.

Basically, we had two independent file transfer systems, each with a configuration database. The databases replicated with each other, so we only had to manage one of them and whatever we did would replicate to the other. The two coordinated, assigning jobs to each other so that they both had a roughly even workload. If one went down, the other would just pick up all the file transfer work. Incoming file transfer connections were balanced between them, so we'd have to lose both servers before we lost the ability to send and receive files. At first, we thought something like "load‐balanced FTP clusters" was kind of ridiculous, but our CEO assured us it let him sleep better at night. Literally 100% of our business depended on incoming and outgoing file transfers; it was ridiculous, he said, that we had four load‐balanced Web servers and only one file transfer server.
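The failover half of that design can be sketched in a few lines. This hypothetical `transfer_with_failover` helper simply tries each server in order and raises only when every one has failed; `send` stands in for whatever callable actually moves the file, and real MFT products handle all of this internally.

```python
def transfer_with_failover(servers, send):
    """Attempt a transfer against each server in turn.

    `send(server)` performs the actual transfer and raises OSError
    on failure. Returns the first server that accepted the file.
    Illustrative sketch only; server names are placeholders.
    """
    errors = []
    for server in servers:
        try:
            send(server)
            return server
        except OSError as exc:
            errors.append((server, exc))
    raise ConnectionError(f"all servers failed: {errors}")
```

The same shape works whether "servers" means two boxes in one rack or, as in the geographic example below, a US node and an Asian node backing each other up.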

There are other reasons to create load‐balanced, highly‐available "file transfer farms." One might be to geographically distribute load. For example, if you have an office in the US and one in China, and frequently do file transfers within each continent, you might want to set up a server in the US and one in China to handle transfers within those continents. If the servers could be combined in some fashion, they could also offer failover for each other: File transfers from the US server to the Asian continent might not be as efficient, but it would be better than nothing.


It's worth spending a little time thinking about what downtime in file transfer capabilities actually costs you. For my dot com company, it literally cost us tens of thousands of dollars in refunded orders to annoyed customers whose orders were delayed by 2 days. Knowing the cost of downtime will make it easier for you to balance the cost of adding high availability to your file transfer infrastructure; there are often many ways in which an MFT solution can be built for high availability, and knowing your cost threshold will help drive the necessary design decisions.
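The arithmetic behind that threshold is simple enough to put in one function. The inputs here are made‐up numbers in the spirit of my dot com story; substitute your own figures.

```python
def downtime_cost(orders_per_hour, avg_order_value, refund_fraction, hours_down):
    """Rough expected cost of a file-transfer outage.

    All four inputs are assumptions you supply; the point is to get
    a defensible number to weigh against the price of adding high
    availability to the file transfer infrastructure.
    """
    return orders_per_hour * avg_order_value * refund_fraction * hours_down

# e.g. 25 orders/hour, $80 average order, 30% refunded, 48-hour outage
cost = downtime_cost(25, 80.0, 0.30, 48)  # 28800.0, i.e. about $29k
```

If the projected loss from one plausible outage exceeds the price of a second server and some load balancing, the HA decision pretty much makes itself.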

Communications Protocols

Businesses' file transfer needs are also strongly driven by the communication protocols they use to transfer files. In the ancient past of IT—say, 6 or 7 years ago—it was acceptable for companies to adopt proprietary protocols, forcing business partners to adapt to them. Today, with the wide availability of robust, open protocols, asking a business partner to switch to a different protocol is basically a slap in the face. The problem is that there are so many open, common protocols! Once a given business adopts one, they hate switching to something else, so in some cases, you have to be the flexible one, offering support for as many protocols as you practically can while meeting your other business requirements.

Today, the number of open, readily‐available file transfer protocols is pretty large:

  • AS1, AS2, and AS3. These applicability statements describe how to transport data securely and reliably. Security is usually based upon digital certificates and encryption. AS1 is based on the SMTP (mail transfer) and S/MIME (secure file encoding) protocols. AS2 is built around HTTP and S/MIME. AS3 utilizes FTP.
  • Network file copy. This is simply an automated version of dragging files from a network drive to a local drive or another network drive, suitable for use within an intranet environment or over a Virtual Private Network (VPN). Common protocols for network file copy include Server Message Block (SMB, used by Windows) and Network File System (NFS, common on Unix‐based systems).
  • HTTP. The standard protocol for transferring Web pages between servers and browsers, HTTP is suitable for transmitting any kind of data. Web Services protocols (REST, SOAP, and so forth) utilize HTTP to transmit data, for example. HTTP is not intrinsically secured.
  • HTTPS (HTTP over TLS). By adding Transport Layer Security (TLS) to a normal HTTP connection, you can add both encryption and authentication, helping to secure the entire connection and the data being transmitted.
  • FTP. The granddaddy of Internet‐based file transfer, FTP is generally quick and efficient, but it lacks any kind of intrinsic security, including encryption. It is extremely widely available, however, with FTP clients installed on virtually every kind of modern computer operating system (OS).
  • FTPS (FTP over TLS). This is an extension to FTP, adding TLS (often but incorrectly referred to as SSL). This is one of the most common forms of "secure FTP," a term that in practice can refer to several protocols.
  • Secure File Transfer Protocol (SFTP). This is a part of the "Secure Shell" (SSH) protocol; it is also referred to as the SSH File Transfer Protocol. This isn't quite the same as running a normal FTP session within an SSH session (that's next), but is rather a completely separate protocol. SFTP is not to be confused with the Simple File Transfer Protocol, which is also sometimes referred to as SFTP.
  • FTP over SSH. Yet another "secure FTP" variant, this protocol tunnels a normal FTP session through a Secure Shell (SSH) connection. This is also referred to as "Secure FTP." The actual FTP traffic is unsecured, but it runs through a secured, encrypted SSH session.
  • Secure Copy Protocol over SSH (SCP over SSH). This works similarly to FTP over SSH, running a Secure Copy (SCP) session tunneled through an SSH connection. SCP normally encrypts transferred data, but the SSH tunnel also encrypts authentication and other traffic.
  • SMTP/POP3. Email can be used to transmit data—after all, an email is really just data of some kind in a text format. Some organizations may send or receive data via email protocols, and SMTP and POP3 provide that capability.
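As a taste of what scripting one of these protocols looks like, here is explicit FTPS using nothing but Python's standard `ftplib` module. The host, credentials, and file names are placeholders you'd replace with your own; the `DEFAULT_PORTS` table just records the well‐known default ports for a few of the protocols above.

```python
from ftplib import FTP_TLS

# Well-known default ports for several of the protocols listed above.
DEFAULT_PORTS = {"ftp": 21, "ftps-implicit": 990, "sftp": 22,
                 "http": 80, "https": 443}

def upload_ftps(host, user, password, local_path, remote_name, port=21):
    """Upload one file over explicit FTPS (FTP upgraded to TLS).

    Host and credentials here are placeholders; plug in your own.
    """
    ftps = FTP_TLS()
    ftps.connect(host, port)
    ftps.login(user, password)  # ftplib issues AUTH TLS before logging in
    ftps.prot_p()               # switch the data channel to TLS as well
    with open(local_path, "rb") as f:
        ftps.storbinary(f"STOR {remote_name}", f)
    ftps.quit()
```

Notice how little of an MFT solution this covers: there's no retry, no audit trail, no scheduling. That gap is exactly what the rest of this book is about.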

These are the most popular and commonly‐used open, Internet‐based protocols in use today for file transfers. In fact, there are about a half‐dozen other FTP variants (as if there weren't enough already), semi‐proprietary file transfer protocols, and more. The list I've provided, however, contains the protocols that 99% of companies will be using for 99% of their file transfers.

To be frank, I consider almost all of these to be "must‐have" protocols for a file transfer infrastructure. In any company I've ever worked for, either as an employee or a consultant, we've eventually needed to use almost all of these at one time or another. Even if we started out only needing, say, FTPS, we would run into other business partners who preferred SFTP, or a system that could only accept data via HTTPS, or a business unit that was receiving information via SMTP and POP3. It's become easier for me to simply specify all of these common protocols as base requirements, as that usually leads me to a solution that will last me longer, and serve in a larger variety of business situations.

Programmability, Customization, and Workflow

I worked for a bookstore chain at one time. This was before any single online retailer really owned the book universe, and so we had plenty of brick‐and‐mortar competition. We didn't really compete on product; we all carried basically the same books, and could special order anything else a customer might want. We didn't even compete on price; publishers set the suggested prices for books and drove most of the promotions, so we all tended to have the same prices and discounts on the same books. What we competed on were our business processes. We worked hard to be better at stocking our stores with new titles, restocking old ones quickly, and so on. My point is that no two companies are identical, even if they're selling identical products at identical prices.

Because file transfer is so closely tied to business processes, you should therefore expect everyone's file transfer needs to be slightly different. That means a file transfer infrastructure has to work the way your company needs it to—not in some generic fashion that your company has to adapt to.

There are many ways in which MFT solutions accommodate different business processes. Some solutions offer an Application Programming Interface (API) that allows your own software developers to create custom file transfer consoles and clients or to incorporate file transfers into line‐of‐business applications and other custom business processes. These APIs may work for Microsoft's .NET Framework, Microsoft's Component Object Model, Sun's Java, or some other development platform. Some vendors may offer a variety of APIs to accommodate a variety of development languages.

Vendors might provide a custom scripting language, or support existing scripting languages, so that you can "program" your MFT solution by writing simpler scripts and batch files. Others might provide command‐line utilities that can be scripted by an experienced administrator or programmer to customize specific operations to meet the company's business processes.

Another way in which MFT solutions can be customized is through workflow. Typically, this involves much less expertise and overhead than programming. In addition, this option offers a solution to situations in which one person may queue up a file for transfer but another has to review and approve it, and you want to track all that review/approval activity in a log of some kind. Workflows within your business may be simple "review/approve" workflows or they may be complex processes that mirror specific business processes that have been defined within your organization. Solutions do vary considerably in how they allow you to define these workflows: Some may actually require some level of scripting or programming, while others may use graphical user interfaces (GUIs) to let you visually connect workflow components into a complete process. I'm definitely a bigger fan of the "graphical" style of workflow construction, as it allows less technically‐skilled users—such as business process owners, rather than programmers—to construct workflows and even maintain and modify them on an ongoing basis.
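The core of a review/approve workflow with an audit trail can be sketched in a few lines of Python. The states and rules here are illustrative only, not any particular product's model:

```python
from datetime import datetime, timezone

class TransferRequest:
    """A queued file transfer that must be approved before it is released."""

    def __init__(self, filename, submitted_by):
        self.filename = filename
        self.submitted_by = submitted_by
        self.state = "pending"   # pending -> approved (or rejected)
        self.audit = []          # permanent record: when, who, what
        self._log(submitted_by, "submitted")

    def _log(self, who, action):
        self.audit.append((datetime.now(timezone.utc).isoformat(), who, action))

    def approve(self, reviewer):
        # Separation of duties: the submitter cannot approve their own transfer
        if reviewer == self.submitted_by:
            raise PermissionError("submitter may not approve their own transfer")
        if self.state != "pending":
            raise ValueError("cannot approve a %s transfer" % self.state)
        self.state = "approved"
        self._log(reviewer, "approved")
```

Every state change lands in the audit list, which is exactly the review/approval log described above; a graphical workflow designer generates this kind of logic for you instead of making you write it.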

Integration with Existing Technology Assets

Your business likely already has a significant technology investment, and adding a formal file transfer infrastructure shouldn't require you to re‐build much of that infrastructure. Ideally, an MFT solution should integrate well with other technology assets, such as existing Web sites for data transfer or directory services for user authentication.

There are a number of potential integration points that you can consider for your file transfer infrastructure:

  • Directory services. Whether it's Active Directory (AD) or some other directory, you might want to have your file transfer infrastructure authenticate users from an existing directory service. More and more organizations are seeking to reduce their total cost of identity and access management (IAM), and solutions that utilize an existing directory—rather than adding another user database to the environment—help support that cost reduction.
  • Databases. A file transfer infrastructure requires a database to store configuration, job, schedule, and logging information; the ability to use existing database resources—such as an existing Microsoft, Oracle, or IBM database server—can be a benefit to some companies. Using an existing database means the file transfer solution will add less administrative overhead to the environment. However, some MFT solutions are entirely self‐contained, using their own internal database. Provided that internal database doesn't require excessive maintenance and administration, it may not add enough overhead to worry about.
  • Web servers. Some MFT solutions offer Web‐based UIs, especially for users authorized to create ad‐hoc transfers. Figure 1.2 shows an example of a Web‐based interface. Some solutions may be able to expose this interface through existing Web servers; others may have an embedded Web server that requires no additional maintenance. Less desirable are solutions that require you to add a specific new Web server that you otherwise wouldn't need, such as adding an Apache server to an all‐IIS shop or vice‐versa.

Figure 1.2: Web‐based file transfer UI.

Other integration points might relate to specific line‐of‐business applications, such as predefined workflows that integrate with inventory systems, customer management systems, and so on.

Scheduling and Monitoring

Scheduling is important for recurring file transfers, of course, but it can also be important for ad‐hoc transfers, either from system to system or person to person. In other words, just because I want to set up a one‐time file transfer doesn't mean I want it to happen right now; I may need it to happen later in the day or even on a specific day in the future. Your scheduling needs may be fairly simplistic or quite complex. Figure 1.3 shows what a file transfer infrastructure might offer in terms of a fairly simple, straightforward UI for scheduling one‐time or recurring transfers.
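Under the hood, even a simple scheduler boils down to computing the next fire time from a start time and an optional recurrence interval. A Python sketch, purely for illustration:

```python
from datetime import datetime, timedelta

def next_run(schedule, now):
    """Return the next fire time for a one-time or recurring schedule.

    schedule: {"start": datetime, "every": timedelta or None}
    A one-time job (every=None) always fires at its start time.
    """
    start, every = schedule["start"], schedule.get("every")
    if now <= start or every is None:
        return start
    # Recurring: round the elapsed time up to the next whole interval
    intervals = (now - start) // every + 1
    return start + every * intervals
```

Real scheduling needs get messier quickly (business days, holiday calendars, "last Friday of the month"), which is why a straightforward scheduling UI like the one in Figure 1.3 matters.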

Figure 1.3: UI for scheduling transfers.

Monitoring really falls into two broad categories. The first is the ability of individual users to monitor their file transfer jobs; Figure 1.2 showed how a simple user‐oriented interface might do that. Other systems might offer users the option of receiving status emails when their jobs complete, run into a problem, and so on.

The other category of monitoring is the broader, IT operations‐level monitoring. IT needs to be able to monitor the health and performance of the file transfer infrastructure, receive alerts when something goes wrong, and so on. This monitoring might be accomplished by a console or utility specific to the MFT software that you implement. In other cases, a solution might provide monitoring that integrates with other monitoring consoles, such as HP OpenView, IBM Tivoli, or even Microsoft's System Center Operations Manager. In still other cases, the solution might simply expose monitoring instrumentation, such as Windows performance counters, which can in turn be accessed by a wide variety of monitoring tools.

Both types of monitoring are useful and desirable. Users will be more likely to use a file transfer solution if they don't feel that they're dumping their transfer requests into a big, black box; being able to check the status of their jobs and receive notifications adds a level of confidence, and confidence leads to usage. Operations‐level monitoring is obviously crucial for any mission‐critical service; IT needs to be able to spot upcoming problems based on patterns and trends, and needs to be alerted quickly if something fails or goes wrong.
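As a toy example of the operations side, an alerting rule along the lines of "warn when N of the last M transfers failed" might look like this; the thresholds and message format are invented for illustration:

```python
from collections import deque
from typing import Optional

class TransferMonitor:
    """Raise an operations-level alert when too many recent transfers fail."""

    def __init__(self, window=20, max_failures=5):
        self.recent = deque(maxlen=window)   # rolling success/failure history
        self.max_failures = max_failures

    def record(self, job_id, ok) -> Optional[str]:
        self.recent.append(ok)
        failures = self.recent.count(False)
        if failures >= self.max_failures:
            return "ALERT: %d of last %d transfers failed (latest job %s)" % (
                failures, len(self.recent), job_id)
        return None
```

A real solution would feed this kind of signal into whatever console you already run (OpenView, Tivoli, Operations Manager) rather than printing it; the pattern-spotting logic is the same either way.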

File Transfer Frequency and Volume

This is a tricky business requirement that I think a lot of people overlook—I certainly did when I started dealing with automated file transfers back in the day. In fact, there's a good story here. I used to work for a network engineering and management company, which was a sub‐division of a regional utility company. To make a long story short, we offered our customers the ability to have our services appear on their utility bill—much as you might pay for satellite television on your telephone bill today. Actually making that happen was a lot harder than you might think: We not only had to get our data into the right form but we had to transmit it in a very specific fashion, at very specific times. We didn't realize how specific those times were until, one month, we transmitted everything a couple of days early to work around a holiday.

It turns out that the reason our delivery schedule was so specific was that the utility's file transfer server was single‐threaded—it could only handle one file transfer job at a time. Crazy, right? And the day we decided to send our billing information happened to be the day that another subdivision was assigned—and so we basically crashed the whole system. Oops.

The moral of the story is this: Think about the frequency and volume of your file transfers, and build your file transfer infrastructure appropriately. Find out if there's a limit on the number of simultaneous tasks, for example, especially if you'll be implementing complex workflows for file transfer. Get a feel for the maximum sustained data throughput your proposed infrastructure can support, and decide whether it'll be sufficient for your purposes. If you'll be handling a truly enormous amount of file data, you may even need to consider load balancing within the file transfer infrastructure. That load balancing can also provide a degree of high availability, helping meet those two business needs.
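The single‐threaded server in that story is essentially a concurrency limit of one. This small, hypothetical Python sketch shows why asking about simultaneous‐task limits matters: a limit of 1 refuses the second job outright, just as the utility's server choked when two subdivisions submitted at once.

```python
import threading

class TransferServer:
    """Reject jobs beyond a fixed concurrency limit instead of accepting everything."""

    def __init__(self, max_concurrent=1):
        self._slots = threading.Semaphore(max_concurrent)

    def start_job(self, name):
        # Non-blocking acquire: if no slot is free, refuse the transfer
        return self._slots.acquire(blocking=False)

    def finish_job(self):
        self._slots.release()
```

Knowing whether your infrastructure queues, refuses, or crashes at its limit, and what that limit is, belongs on your evaluation checklist.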

Content Security—Preventing Malware

There is a ridiculous amount of malware out there today, and a file transfer infrastructure offers a huge opportunity for more of it to enter your environment—and an opportunity for malware in your environment to spread to other environments. Fortunately, most of what a file transfer infrastructure handles are simple data files—CSV files, XML files, and so on— that don't contain any executable code. But some files that move through your infrastructure will contain executable code, and you need to deal with it.

My preference is to not have the file transfer infrastructure implement its own anti‐malware measures. I have more than enough anti‐malware software in my environment that needs to be kept updated; I hardly need another. What I prefer to see is a file transfer infrastructure that can use the anti‐malware stuff I already have. In some cases, that may mean simply dumping files to disk where my anti‐malware software can scan them like it does every other new file. Higher levels of integration may allow file transfer infrastructure components to actually submit incoming files to the anti‐malware engine before writing them to disk or doing anything else with them.
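That "scan before anything else touches the file" ordering is the key design point. Here's a deliberately simplified Python sketch: the scan() stub just checks a hash against a made‐up blocklist (the "EICAR-TEST" string below is a stand‐in, not the real EICAR test file), standing in for a call to whatever anti‐malware engine you already run.

```python
import hashlib

# Hypothetical blocklist; a real deployment would hand the bytes to the
# site's existing anti-malware engine instead of comparing hashes here.
KNOWN_BAD = {hashlib.sha256(b"EICAR-TEST").hexdigest()}

def scan(data):
    """Return True if the content is clean (per our stand-in scanner)."""
    return hashlib.sha256(data).hexdigest() not in KNOWN_BAD

def receive_file(data, write):
    """Submit an incoming file to the scanner *before* writing it anywhere."""
    if not scan(data):
        return False   # reject/quarantine; the file never reaches disk
    write(data)
    return True
```

The design choice being illustrated: the write callback only ever runs on content the scanner has cleared, so infected files never land where other processes could pick them up.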

Another scenario is a file transfer infrastructure that doesn't integrate with the specific anti‐malware solution you have but does integrate with an existing, well‐known, third‐party anti‐malware solution. I'm okay with that, too, in large part because it brings another malware‐scanning engine and technique into the environment, meaning my total anti‐malware effort is more likely to catch everything.


Cost

This last business driver is hardly ever the least important: How much will it cost? All of the other business concerns are—and should be—weighed against the cost. If high availability costs significantly more than what would be at risk without it, then you don't get high availability.

Ideally, a file transfer infrastructure should be somewhat modular. Businesses shouldn't be forced to buy a "one size fits all" solution, because no business is exactly like another.

Modular components—making high availability an option, for example—help businesses customize a solution that fits their needs and risk mitigation requirements in a cost‐effective fashion. When I'm building any kind of infrastructure service, including file transfer, I like to look for solutions that offer just one or two feature‐related "editions," and then lots of "options." That way, I can build what I need now, and add options later as I need them and can afford them. It's a bit like buying a full‐sized computer over a laptop: With the laptop, you get everything in one package, but you have to be careful to buy the best one you'll ever need because laptops are relatively difficult to upgrade. A full‐sized computer, however, can usually be opened up, and its components can be upgraded, swapped out, and so forth, all with relative ease. That means you can buy a relatively low‐powered computer to begin with, then upgrade specific options as needed in the future.

Perhaps most important, though, is something I'll address in more detail in the next chapter: The cost of doing nothing. In other words, if you have file transfer needs, you can't just consider the cost of a file transfer infrastructure as overhead. Why?

If the business truly needs something, that something will get accomplished somehow. If it isn't through a formal file transfer infrastructure, then it will be through an informal infrastructure, often composed of cobbled‐together components, do‐it‐yourself scripts, and so on. Those are not free. It can be very difficult to discern their true cost, but there is—as we'll see in the next chapter—a cost in maintenance, risk, and so forth. Compare that with the cost of a more formal, integrated, supported file transfer infrastructure that meets all your business needs.

Common File Transfer Myths

As I work with consulting clients and speak with IT professionals at various conferences and tradeshows, I encounter more than a few misconceptions and bad assumptions related to file transfer. These myths range from relatively minor misunderstandings to deeply held beliefs that actually hold back the person's entire organization. Let's play "Mythbusters" and examine some of these myths. I'll look at the most common ones I run across, explain where they came from—because many of them do, in fact, contain a nugget of truth—and see how they hold up to cold, hard facts.

Myth 1: Security Is Not Important

This is probably one of the first and most wildly inaccurate myths I run across. I can barely comprehend anyone in a modern business environment believing that security really isn't important. Today's businesses know that security is important—and therein lies the grain of truth in this myth. Today's businesses do care about security; businesses of yesteryear often did not. In fact, even through the late 1990s, many businesses simply didn't focus very much on security. It wasn't at all unusual for file servers to contain a permission for "Everyone: Full Control" at the root of their hard disks, with that permission inheriting to every file and folder on the server. I worked for one company in the late 1990s that assigned a public IP address to every computer on the network and didn't have a firewall between their network and the Internet. Unthinkable a decade later, but in the past, there simply weren't quite as many security threats, and so there wasn't much security focus. In many respects, security wasn't important, at least not for many businesses.

If Security Isn't Important, Then Why…

I enjoy having conversations with clients who start by saying, "security isn't really a concern for us." In most cases, what they're telling me is that they don't want to use security as a blanket argument in favor of implementing some technology solution—they want other reasons to implement something. Which is fine. But prior to having that conversation, I usually had to check in with a security receptionist, sign into a log, be issued a guest badge, get escorted through card key‐protected doors into a conference room, and had my route monitored by security cameras. And "security isn't a concern" for them?

I think "security" gets pulled out as a business driver so often that business people—especially business technology people—just get sick of it. It seems like every IT vendor in the world tries to use "security" as a way of getting their foot in the door or closing the sales pitch. And that can definitely be frustrating, but it's disingenuous to say that "security isn't a concern for us."

Trite as it is, security has to be a concern. It's rarely the only concern, but it's always going to be there. I've never met a single company who could happily live without any security concerns at all—everyone locks the doors to the office, keeps the cash in a safe, and so on. IT security is no different—we all get tired of hearing about it and reading about it, but it's something we have to pay attention to.

Today, of course, the world is different. There are a few more security threats out there, and we've become aware of many more security concerns than we were in the past. Nobody in their right mind would operate a network without one or more firewalls, without anti‐malware tools in place, and so on. There are really two reasons that today's companies focus a bit more on security than they did in the past: internal concerns and external requirements.

External requirements are fairly new, coming into play in the mid‐ to late‐1990s. These are often imposed by governments or by industry groups, typically focused on a single business industry or class of companies and are often designed to bolster consumer protections or government oversight. In the US, common external requirements include:

  • The Payment Card Industry (PCI) Data Security Standard (DSS)
  • The Health Insurance Portability and Accountability Act (HIPAA)
  • The Gramm‐Leach‐Bliley Act (GLBA)
  • The Sarbanes‐Oxley Act (SOX)
  • Various US federal requirements for government agencies and contractors

At a very high level, these requirements have some common goals. Each of them focuses on a particular kind of data; HIPAA covers patient healthcare information, for example, while SOX focuses on the financial reporting of publicly‐traded companies. For the information that these requirements focus on, they typically require that:

  • Information not be disclosed to unauthorized entities
  • Access to information be audited in a permanent log
  • Information be verifiably secured to prevent unauthorized disclosure
  • Information be retained for some specified period of time (and often no longer)

Security Is Not Just Security

Security per se isn't what many companies are most concerned about, especially companies subject to external requirements. Former Gartner research director L. Frank Kenney said, "It's never just been an issue of security, although security is one of those low‐hanging fruits that get everyone's attention. More companies are affected by a failing audit and the fear of failing audits. The bigger issue is: Can I be assured, can I show, that a file that moves from point A to B has been secured, that the person who sent it had the authority to send it and was authenticated?" and so forth. Security is simply the broad term we use as a shorthand way of referring to all of these concerns.

Although the specifics, of course, differ between them, these common high‐level goals make it clear that security is important—to someone watching over your company, if not to your company itself. These external requirements often come with fines for non‐compliance, and can in some cases restrict or remove a company's ability to operate in a given industry. The PCI DSS, for example, is enforced by credit card companies, and failure to comply with its requirements can result in a company losing the ability to process credit card transactions—a pretty serious obstacle for many companies.

Internal requirements encompass all the things that a company does for its own benefit. In some cases, companies' internal requirements mirror and supplement those of external requirements—because, in many cases, the external requirements are just a codification of "doing what's right" in the first place. In other cases, internal security requirements help protect the company from things it perceives as damaging, such as industrial espionage and intellectual property theft and disclosure of customer (or other) information that would result in damaged customer relations.

So where does all of this fit in with file transfer? Security in the IT world nearly always refers to the security of information—the "I" in "IT." Information tends to leave a company in one of two major ways: via email and via file transfer. Simply put, if you're going to find a violation of internal or external security requirements, you've got even odds that file transfer will be behind the violation. From a business perspective, you should really have a couple of major security goals:

  • Ensure that only authorized users can transfer files and that only authorized recipients receive them. As this is often impractical, you at least want to keep track of who sends what to whom so that you can take corrective actions if necessary.
  • Ensure that only the intended recipient can access transferred files. In other words, you don't want unintended people snooping on your information transfers.
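The fallback named in the first goal above (keep track of who sends what to whom) amounts to an authorization check plus an audit record for every attempt, allowed or not. A minimal Python sketch, with invented sender names:

```python
# Illustrative allow-list; real systems would consult a directory service
AUTHORIZED_SENDERS = {"alice@example.com", "batch-svc@example.com"}

audit_log = []

def transfer(sender, recipient, filename):
    """Permit only authorized senders, and record every attempt either way."""
    allowed = sender in AUTHORIZED_SENDERS
    audit_log.append({"sender": sender, "recipient": recipient,
                      "file": filename, "allowed": allowed})
    return allowed
```

Note that the denied attempt is logged too; that record is what lets you take the corrective action the first goal describes.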

There are some other more subtle security goals, but I'll save those for later in this chapter. At a basic level, making sure only authorized entities have access to your information is the primary goal of security—and it plays a strong role in file transfer.

Myth Busted

Security is important, and today's companies care. File transfer is a major opportunity for security problems, and security has to be a major consideration in any file transfer scenario. In fact, security is so important that you shouldn't even be talking to vendors who can't lead off the conversation with a rock‐solid security story.

Myth 2: Homegrown Is Cheaper

Most companies have relied—at least briefly—on homegrown file transfer "solutions" similar to the following example:

set locus local                     ; Avoid K95 LOCUS popup
ftp /noinit /anonymous
if fail end 1 Connection failed
if not \v(ftp_loggedin) end 1 Login failed

ftp cd /pub/redhat/linux/rawhide/i386/RedHat/RPMS/
if fail end 1 CD to RPMS directory failed

set ftp dates on                    ; Preserve file dates
set xfer display brief              ; FTP-like transfer display
ftp type binary                     ; Force binary mode
set incomplete discard              ; Discard incompletely received files
set ftp collision discard           ; Don't download files I already have

if >= \v(version) 800205 set take error on

; Get the files...

mget libstdc++* glibc-devel* ncurses*
mget /except:{*devel*} 4* GConf2* Glide3* LPR* MAKE* O* PyXML* SysV* V*
mget /except:{{*devel*}{a[bm]*}{anaconda*}{aspell-[a-z]*}} a*
mget /except:{{*devel*}{balsa*}{bash-doc*}} b*
mget /except:{{*devel*}{chromium*}{compat-[dgl]*}{cvs*}{cWnn*}} c*
mget /except:{{*devel*}{db4-[uj]*}{ddd*}{d[bd]skk*}{desktop-*}{dia*}{docbook-style*}{doxygen*}} d*
mget /except:{{*devel*}{eel*}{emacs-[el]*}{emacs-p*}{epic*}{evolu*}{exmh*}} e*
mget /except:{{*devel*}{festival*}{fonts*}{freeciv*}} f*
mget /except:{{*devel*}{g[ailnt]*}{gcc-[gjo]*}{gd[bm]*}{gedit*}{gphoto2*}{gsl*}} g*
mget /except:{{*devel*}{gaim*}{galeon*}{gated*}{gtk-engines*}{gtkhtml*}{gimp-print-c*}{gimp-[d,0-9]*}} ga* gi* gt*
mget /except:{{*devel*}{*.i686.rpm}{glade*}{glibc-[dp]*}} gl*
mget /except:{{*devel*}{gnome-m*}{*game*}{*user*}{*pilot*}{*audio*}{gnucash*}{gnumeric*}} gn*
mget /except:{{*devel*}{ht[td]*}{im*}{inn*}{imap*}{isdn*}{itcl*}} h* i* j*
mget /except:{{*devel*}{k[emnopsvW]*}{kde*}{klettres*}{ktouch*}{kakasi*}{kappa*}{krb5-serv*}} k*
mget /except:{{*devel*}{kde-i18n*}{kde[2gmtv]*}{kdeartwork*}{kdebindings*}{kdepim*}{kdesdk*}} kde*
mget /except:{{*devel*}{kernel-[bBdsu]*}} ke*.i386.rpm
mget /except:{{*devel*}{kmail*}{knm*}{knode*}{koffice*}{kooka*}{kppp*}{kstar*}} km* kn* ko* lp* ks*
mget /except:{{*devel*}{libgcj*}{libgnat*}{libtab*}{lic*}{la[mp]*}{lftp*}} l*
mget /except:{{*devel*}{m[cguxy]*}{mailman*}{manpages-[a-z]*}{mew*}{miniChin*}{mod_*}{mrtg*}} m*
mget /except:{{*devel*}{nautilus*}{ncpfs*}{nmh*}{noatun*}{nss_*}{nut*}{nvi*}} n*
mget /except:{{*devel*}{*.i686.rpm}{octave*}{open[ho]*}{openldap-[cs]*}{openmotif2*}{openssl0*}} o*
mget /except:{{*devel*}{p[hvwx]*}{pan*}{perl-PDL*}{post*}{pydict*}{python-d*}} p*
mget /except:{{*devel*}{qt-design*}{qt2*}{quanta*}{recode*}{ruby*}} q* r*
mget /except:{{*devel*}{s[eqwy]*}{sane*}{skkd*}{snavig*}{splint*}{stard*}} s*
mget /except:{{*devel*}{sendmail-doc*}{sylph*}{swig*}} se* sw* sy*
mget /except:{{*devel*}{t[oW]*}{t*fonts*}{tclx*}{tetex*}{timidity*}{tripwire*}{tuxracer*}} t*
mget /except:{{*devel*}{unixODBC*}{uucp*}{vim-[eX]*}{vnc*}} u* v*
mget /except:{{*devel*}{w[3l]*}{wine*}{wordtrans*}} w*
mget /except:{{*devel*}{x[aef]*}{xc[dhi]*}{xine*}{xmms*}{xpdf-[a-z]*}{xsane*}{xtrace*}} x*
mget /except:{{*devel*}{XFree86-[cdIX]*}} X*
mget /except:{{*devel*}{zebra*}{zsh*}} y* z*

end 0

This script automates a command‐line File Transfer Protocol (FTP) client—in this case, the Kermit 95 client. Here, it's retrieving OS updates from a vendor's servers. It works. It isn't terribly pretty, but it works. Scripts like this have existed since the advent of FTP, and they will probably continue to exist for years and years. Systems administrators rely on them to automate complex tasks—clearly, the task automated by this script is fairly complex, and you certainly wouldn't want to type all those commands manually on a regular basis. And this is what FTP scripts commonly look like: A series of complex commands that execute in a sequence. But is a homegrown "solution" like this one cheaper than a commercial file transfer application? It depends on what you mean by "cheaper." A homegrown solution is certainly cheaper than running the same commands manually; the time saved by the administrator is probably pretty easy to figure out. There's no question that automation saves time.

But a script like this can be tricky. FTP itself doesn't support a great deal of diagnostic logging; if a single command fails, you won't necessarily know about it. If your script is implementing a mission‐critical business function—such as transferring data to a business partner on a regular basis—you may not know a problem exists until someone calls and asks where their data is.

Scripts like this can also be expensive to maintain. It's great to have someone on your team who has the knowledge and skills to write such a script. If they're the only one on the team, you'd better keep them happy, because if they leave, all of a sudden your convenient scripts become a huge liability. If one stops working, you may not be able to fix it quickly— meaning your business may suffer.

Homegrown = Very Expensive

Homegrown solutions aren't limited to FTP scripts. Sometimes companies accidentally put themselves into the software development business to solve a particular line‐of‐business problem in their own specific way. Hewlett‐Packard did so, using an in‐house system named Omega to keep track of commissions for the company's salespeople. Omega was the perfect example of a homegrown solution: It was completely outside the company's core expertise, was developed more than a decade ago (it started at DEC), probably did a great job in the beginning, and was never something the company intended to sell. But in 2009, the company was hit with a class‐action lawsuit because the homegrown solution had improperly compensated some 2,000 salespeople. The problem was simply that HP had grown bigger than Omega could handle, despite efforts to keep the software updated.

Christopher Cabrera of Xactly Systems says, "That's one of the big problems with homegrown systems: they can cost you big time, often just when you need them the most. In terms of hard dollars, homegrown systems are pricey to build, costly to maintain, difficult to economically scale, and expensive to modify in the face of business change. In terms of opportunity costs, they lack the latest features and functionality, and they slow down and mess up vital business processes and initiatives."

Getting back to file transfer, an Information Security Magazine article quoted Gartner analyst L. Frank Kenney, "Most places have homegrown solutions. It's not greenfield; just about everyone has leveraged FTP." The article also noted that most industry experts believe, "If you roll your own, the cost is very high. And, you don't have a consistent way to manage security around file transfers."

Often, homegrown solutions start with a pretty simple objective: move this file from here to there. Then one day a problem occurs, and it takes a few days for anyone to notice. So the homegrown solution is modified to include some logging and perhaps email notifications of success or failure. Then some security auditing is added. Then the script is modified to support encrypted connections. Server names change, so the script is modified to have the new connection information. An OS upgrade breaks the script, so it's rewritten. Over time, without realizing it, you've spent a lot of time and money on this homegrown, "cheaper" solution. Your company is now in the business of application development and support— albeit in a part‐time, unintended fashion.

My point is this: Homegrown solutions seem cheaper at the start. They rarely stay that way, and their costs creep up on you. Homegrown tools rarely meet every business need—did you see any security auditing in that FTP script? Any reporting? Anything to prevent use by unintended individuals? I have never been to a consulting client that didn't have some kind of homegrown file transfer tools in place, and I have never spoken with a client who didn't regret, to some degree, having such a strong dependency on those homegrown tools.

Myth Busted

If something is important to the business, then it should be done using business‐grade tools, not homegrown hacks. Unless your company writes file transfer software for a living, you shouldn't be writing file transfer software—like scripts—at all.

When Homegrown Isn't Bad

If you've ever looked at a commercial file transfer solution, you know many of them support some level of automation and customization—often through scripts. So why aren't those scripts the same as the "bad" ones I've been discussing?

A "plain" FTP script starts with very little functionality beyond moving files back and forth—it's not a full programming language, there's no logging, no security, none of that. Adding those capabilities takes even more programming, and that's where you start really spending money on something that you thought was "free."

A file transfer solution, however, has all those capabilities built in (assuming it's a decent one, of course). A "script" that runs within that solution is simply invoking pre‐built capabilities and functionality in a specific order—literally a script, telling the actor (the file transfer software) what lines to read (what actions to take) in a specific order. Those scripts aren't as hard to maintain over time because they tend to be less complex.

A really good file transfer solution, however, will allow you to "write scripts" without writing scripts. Some offer graphical user interfaces and "wizards" to help build automated task sequences. Behind the scenes, the result might indeed be a script, but you'll be maintaining it in a way that requires fewer specialized skills and less effort, so you'll keep your overhead lower.

Myth 3: File Transfer Is Just FTP

One reason that many companies start with homegrown tools for file transfer is the perception that "file transfer" is nothing more than FTP, and that in many cases it's a scheduled, automated FTP between two servers—such as transferring data on a recurring basis to or from a business partner. In the 21st century, though, FTP isn't the only game in town. First, there are the numerous variants of FTP that I mentioned in the previous chapter: FTPS, Secure FTP, SFTP, and so on. There are also network copy protocols, such as Server Message Block (SMB), the Common Internet File System (CIFS), and so on. Even email has become a means of file transfer by simply attaching files to email messages.

You don't always get to pick which file transfer technique or protocol you use. You might be happy using FTP because it's easy to automate with those homegrown scripts; a new business partner, however, might insist that data be transferred using the HTTPS protocol—meaning all your FTP scripts are useless. Another new business partner might only want to send data via email attachments—again negating all your FTP skills. Another business partner might require you to transfer data according to the AS2 specification— and your on‐staff FTP jockeys might not even know what that is (it's a combination of HTTP and S/MIME) let alone how to crank out a script for it.

In today's business world, flexibility pays. If your business wants to transfer data to a partner, and that partner requires the use of something like AS2, which answer would you rather give your executive team?

  • "Um, no, we don't even know what that is, let alone have the ability to write a script to do it."
  • "Sure thing, boss."

One reason that file transfer has moved beyond FTP is that FTP is a fairly primitive technology. It was invented in 1971, after all, and despite numerous updates over the years, it doesn't provide a lot of the features folks need these days—like built‐in encryption, delivery confirmation, security logging, content encoding, and so forth. FTP still has its place, as it's fairly simple to use and is available on almost every computer OS in existence—but it isn't the only game in town.

Myth Busted

File transfer can be a lot more than FTP, and you won't always be able to force FTP as the file transfer solution for a given scenario.

Myth 4: Email Is Safe and Secure for File Transfer

This is a myth I've spent a long time arguing about. Let me first acknowledge that of course an email can be encrypted, digitally signed, and so forth, and that most email systems support delivery confirmations and other feedback mechanisms. As a means of moving data from one place to another, email is simple to use, fairly reliable, and broadly accessible. That does not mean it is "safe and secure" for business‐critical data. True, sometimes email might be your only option (if a business partner insists on it, for example), but that doesn't mean email is without significant downsides.

File Transfer Isn't Just for Servers

In a book like this, it's very easy for me to fall into the pattern of talking about system-to-system transfers. That's the type of file transfer I typically work with most, as many of my clients have hired me to automate regularly occurring transfers, typically between one of their servers and a server owned by one of their business partners.

System‐to‐system file transfers are, for me, easy. You plop a file transfer solution in place, set up connection parameters, pick a file transfer protocol, and set a transfer schedule. Everything happens in the background, and nobody but an administrator has any access to change things or mess things up.

In an Information Security Magazine article, John Thielens was quoted as saying, "Ad hoc file transfer is an important trend. You think of managed file transfer as something scripted or for techies, but there's also file transfer technology for human‐to‐human collaboration."

It's true. Those system-to-system transfers are a tiny fraction of a company's total file transfer volume. Most data is moved around by end users on an ad-hoc basis, most often in the form of email attachments. I can't very well expect end users to log into the corporate file transfer solution, set up connection parameters, specify a transfer schedule, and so on—those end users don't have the knowledge to do so, would probably mess up other people's transfer jobs, and quite frankly don't want to go through all that hassle just to get a spreadsheet over to a business partner. That's why email attachments are so popular.

So why is email not "safe and secure"? Because of the way email functions. Look at this:

To: Don Jones <>

Received: by with SMTP id s11cs108046ybg; Mon, 28 Dec 2009 12:56:39 -0800 (PST)

Received: by with SMTP id 14mr10754165wfb.15.1262033799234; Mon, 28 Dec 2009 12:56:39 -0800 (PST)

Received: from ( []) by with ESMTP id 6si89481422pzk.103.2009.; Mon, 28 Dec 2009 12:56:38 -0800 (PST)

Received: from ( []) by (8.13.1/8.13.1) with ESMTP id nBSKubrk032469 for <>; Mon, 28 Dec 2009 13:56:37 -0700

Those are the full message headers—something your mail client normally hides—from an email message I recently received (although I've changed the email addresses and server addresses for this example). This was hardly a "point to point" transfer; the headers show that my message traveled through something like four different mail servers before it got to me. Here's another:

Received: from smtpin137bge351000 ([unknown] []) by (Sun Java(tm) System Messaging Server 7u312.01 64bit (built Oct 15 2009)) with ESMTP id <> for; Tue, 29 Dec 2009 05:59:20 -0800 (PST)

Received: from ([unknown] []) by (Sun Java(tm) System Messaging Server 7u27.04 32bit (built Jul 2 2009)) with ESMTP id <> for (ORCPT; Tue, 29 Dec 2009 05:59:20 -0800 (PST)

Received: from ([]) by with Microsoft SMTPSVC(6.0.3790.3959); Tue, 29 Dec 2009 05:59:17 -0800

Received: from [] (HELO by (CommuniGate Pro SMTP 5.2.13) with ESMTP id 954687680 for; Tue, 29 Dec 2009 08:59:10 -0500

Received: from ([] verified) by (CommuniGate Pro SMTP 5.1.7) with ESMTP id 641729304 for; Tue, 29 Dec 2009 08:41:39 -0500

Received: from ( by (PowerMTA(TM) v3.5r15) id h783j60kup8j for <>; Tue, 29 Dec 2009 05:14:12 -0800

This time, the message went through six servers before it got to its destination. That's how email works—in fact, it's a major feature of the Simple Mail Transfer Protocol (SMTP) that powers Internet email. The idea is that mail can take many routes to get to its destination, so if one route is down, the mail can still get through.
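
You can see this multi-hop routing for yourself: every server that relays a message prepends its own Received: header. Here's a small sketch using Python's standard email library to count the hops in a message; the header text below is a shortened, anonymized stand-in, not the actual headers shown above.

```python
# Sketch: count mail "hops" by parsing Received: headers with the
# standard library's email parser. Hostnames here are invented examples.
from email.parser import Parser

raw_message = (
    "Received: from mta1.example.net by mx.example.com; Mon, 28 Dec 2009 12:56:39 -0800\n"
    "Received: from relay.example.org by mta1.example.net; Mon, 28 Dec 2009 12:56:38 -0800\n"
    "Received: from sender.example.org by relay.example.org; Mon, 28 Dec 2009 12:56:37 -0800\n"
    "Subject: quarterly numbers\n"
    "\n"
    "body\n"
)

msg = Parser().parsestr(raw_message)
hops = msg.get_all("Received")       # one header per relaying server
print(f"This message passed through {len(hops)} servers")
```

Each relay in the chain had the full message in its possession, which is exactly the concern with sensitive attachments.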

That's all well and good when your email is a message to Mom, but when you're talking about sensitive business information, email's roundabout paths present some distinct disadvantages:

  • All this bouncing from server to server takes time. How often have you been on a conference call, sent an email to someone else on the call, and then waited—patiently, I'm sure—while they hit "refresh" over and over on their mail client? Critical business information may need faster, point-to-point delivery.
  • Email doesn't actually support true delivery confirmation. In other words, you the sender can't hand the email to the recipient and know that it's in the recipient's hands. What email can do—if the recipient's email system supports it—is send another email back to you letting you know they got it. Those confirmations can obviously be forged, mislaid, misrouted, or eliminated en route. Direct, point-to-point transfers provide actual delivery confirmation.
  • Your email messages pass through many hands, and each server on the route can keep a permanent copy. In practice, most legitimate mail servers never do because of the space that would entail. But they can. And even if you've encrypted the email contents, giving someone a copy of the message will give them time to break that encryption. Older 40‐bit encryption—which is still in widespread use—can be broken by a home computer in a few days. You'll never know that the decryption is in progress because a copy of your email will have been forwarded to its intended recipient.
  • Even encrypted email messages contain certain information that might be considered sensitive. For example, the sender and recipient identities can't be encrypted because that information is needed to deliver the message. Anyone intercepting the message—whether it's encrypted or not—will know that those parties are in discussions. That knowledge helps someone involved in industrial espionage, for example, focus their efforts on the most promising messages.

For companies that spend thousands of dollars on card keys for their office doors; tens of thousands on computer security like password management tools, firewalls, and so on; and thousands more on security cameras—well, it's just amazing that those same companies would ever consider transferring sensitive data via email. You might as well print the sensitive information, make a few photocopies, stick the copies in envelopes marked "please do not read," and put them in the magazine rack in the office's lobby.

Email sees heavy use as a file transfer mechanism because it's usually the only thing your end users have that is broadly accessible and is compatible with just about every other company in the world. And, as I acknowledged earlier, sometimes it's the only way two parties can exchange files with one another. That does not, however, mean it's safe and secure, and it doesn't mean you shouldn't be looking at some other, better solution for ad-hoc, person-to-person transfers.

Myth Busted

Email is not as safe and secure as a point‐to‐point transfer that doesn't involve so many intermediaries. Email is convenient—but you can look for safer solutions that offer similar convenience.

Myth 5: Slick Means Functional

This is something I'll look at in more detail in Chapter 4, but for now, let's just acknowledge that the best-looking solution isn't always the most functional solution. A pretty graphical user interface is nice, but when it comes to file transfer, it's what's under the hood—protocol support, security features, logging and auditing, stability and reliability, and so on—that matters most.

Myth Busted

Great‐looking file transfer solutions can offer great functionality—but more utilitarian‐looking solutions can also offer great functionality. In the end, you should choose a solution based on features and capabilities, not on visual appeal.

Myth 6: All Encryption Is Equal

There was a time—back in the mid‐1990s, probably—when 40‐bit encryption was the king of the security world. Today, as I've mentioned, a typical home computer can crack 40‐bit encryption in a few days using brute‐force methods. Nobody with any serious concern for privacy would rely on 40‐bit encryption today—but you'll still see software encryption packages touting 40‐bit encryption support.
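
The arithmetic behind such claims is straightforward. This sketch estimates brute-force cracking time for several key sizes, assuming a hypothetical attacker who can test a trillion keys per second; the exact number of years depends entirely on that assumed speed, but the explosive growth with key size does not.

```python
# Sketch: rough brute-force estimates for different key sizes, assuming
# a hypothetical attacker testing 1e12 keys per second. On average an
# attacker only has to search half the keyspace before succeeding.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
KEYS_PER_SECOND = 1e12          # assumed attacker speed, not a measured figure

def years_to_crack(key_bits):
    keyspace = 2 ** key_bits
    return (keyspace / 2) / KEYS_PER_SECOND / SECONDS_PER_YEAR

for bits in (40, 56, 128, 256):
    print(f"{bits:3}-bit key: ~{years_to_crack(bits):.3g} years")
```

At this assumed speed a 40-bit key falls in under a second, while a 128-bit key takes longer than the age of the universe; every added bit doubles the work.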


The only exception, and the right time to use 40‐bit encryption, is when it's the only option for legal reasons. Some countries prohibit the use of strong encryption without a special permit, and some extremely strong encryption algorithms can't legally be exported to certain countries, meaning companies in those countries don't have access to stronger encryption.

Slightly stronger 56‐bit encryption is little better; in 1998—more than a decade ago—a specially‐built hobbyist computer was cracking 56‐bit keys in a few days, and could decrypt 40‐bit encryption in a few seconds.

As a general rule, today's common encryption algorithms rely on 128-bit, 192-bit, and 256-bit encryption keys; larger keys are possible but are rare. These key sizes all refer to symmetric encryption, which is one broad system of encrypting data. But key size isn't the only factor in encryption—the algorithm, or the way the key is used, also plays a role. For example, a 1024-bit key used in an asymmetric system such as RSA is considered equivalent, from a security perspective, to an 80-bit key in a symmetric system; 15,360-bit RSA keys are considered equivalent to a 256-bit symmetric key.

The point of all this is that all encryption isn't equal. That's not to say your company needs the strength of a 256-bit symmetric key—the level the US government approves for "Top Secret" information. However, you may well need just that. So as you look at file transfer solutions, make sure you're examining—in detail—their encryption capabilities. Simply offering encryption isn't enough; you need to know what kind of encryption.

One benchmark you can look for is Federal Information Processing Standards (FIPS) 140-2 approval and validation. FIPS are maintained by the computer security division of the US National Institute of Standards and Technology (NIST). FIPS are used by both the US and Canadian governments, and serve as guidelines for other governments as well as many private-sector security specialists. FIPS 140 covers cryptographic modules; FIPS 140-2 is (as of this writing) the most recent and stringent version of the standard (FIPS 140-1 was written in 1994 and has not been in use since 2003).

Validation is the word to look for. Software developers can claim that their products are "written to FIPS 140‐2 standards," but unless they've been independently tested to verify the claim, it's just words on paper. NIST co‐administers a validation program, called the FIPS Cryptographic Module Validation Program (CMVP—you knew there was another acronym coming), which is used to test software for compliance with the FIPS 140‐2 standard. The program isn't just an automated test of the software, either; it includes a review of the product's design document and source code, as well. Validated software has been confirmed to use the approved cryptographic algorithms, reviewed to ensure there are no "back doors," and examined to ensure that sensitive data is securely erased when it is no longer needed. Software that passes the complete review is issued a numbered certificate, which a software developer should be able to produce for you.

Other FIPS you may run across include FIPS 197, which covers the Advanced Encryption Standard (AES) encryption algorithm, and FIPS 180, which covers the SHA family of hashing algorithms (the related FIPS 198 covers HMAC). Simply put, these standards—which cover government-grade security—can serve as an easy way to determine whether a product offers solid, validated encryption.

Where Was Your Encryption Made?

Use some caution when evaluating file transfer software that offers FIPS-level encryption. Achieving FIPS certification is an expensive and time-consuming process; some software developers prefer not to invest in it and instead purchase a pre-made, FIPS-compliant cryptography component from a third party. That's fine, but you need to ask some detailed questions and examine the software design to be sure that the integration between the file transfer software and the third-party cryptography module is solid and secure. Whenever possible, I prefer software that's a little more end-to-end, at least when it comes to security, and prefer to work with a vendor who has tightly integrated their cryptography code with their main software code. Especially when it comes to security, the "gaps" that can exist when integrating a third-party software module make me nervous.

Myth Busted

Clearly, all encryption is not equal or FIPS wouldn't exist. If data privacy is important to you, make sure you understand exactly what kind of encryption you're getting with a particular solution.

Myth 7: Security Is Just Encryption

This is usually the argument I get in response to my assertion that email isn't a safe, secure means of transferring files. "But if we encrypt it, then it's safe and secure." That's kind of true, as far as it goes. If you use, say, 256-bit encryption on an email message, and you aren't concerned about timely delivery, then it doesn't matter if someone intercepts your email—they're probably going to need a few years, at least, to break the message. Fun studies have been done to guess how long it would take to crack encryption that used various lengths of keys; some of the more impressive estimates suggest that a 128-bit key would take something like 10^13 years to crack using brute-force techniques. For the sake of argument, let's just say that 256-bit encryption is, using current and foreseeable computing technology, invulnerable—something like 3×10^51 years would be needed, and whatever was in the message would likely be pointless after all that time. So does that negate my argument about email not being a safe and secure file transfer mechanism?

No. Encryption, and the privacy it delivers, is not the sum total of security. Security is much more. In fact, as I noted earlier, privacy is not always the top security concern most businesses have.

Encryption protects you from eavesdropping, and that's important. But it will not stop someone from willfully releasing information to the wrong entity. Encryption won't keep a log of who sent the file. Encryption won't prove that the file actually got to where it was going. These are also security concerns, and if you're dealing with business‐class file transfer, they're capabilities you need. Let's take them in turn.

Non‐Repudiation and Guaranteed Delivery

"I sent it." "Well, I never got it." "Yes, you did." "Well, it was different than what you sent."

That's a conversation involving repudiation, or denying something—in this case, that a file was successfully delivered. There are certain types of business information that must be delivered, and you need to have proof that the delivery was completed—proof that can stand up to a possible denial by the recipient. In the physical world, we rely on express carrier tracking numbers, postal delivery receipts, and other devices; even then, it's hard to prove that a document was tampered with in transit, and almost impossible to prove the sender's identity.

So there's more to non-repudiation than just proving delivery: you also need to ensure that the document that was received is in fact the same document that was sent—that it wasn't tampered with in transit. Another aspect is the recipient's ability to validate the sender's identity—critical for ensuring that received documents are authentic and not forgeries.

Typically, all of these needs are met through the use of digital signatures, which rely on asymmetric-key encryption. In an asymmetric key pair, two related encryption keys exist, and they are different. If something is encrypted with one of the two keys, only the other one can decrypt it—and vice versa. One key is typically kept private by the key holder—and is appropriately called the private key. The other, public key is readily available to anyone who needs it.

Here's how it works: The sender of a file uses a private key to digitally sign a file. This digital signature is a small piece of information that is encrypted using the private key; typically, the signature includes identity information about the sender, and checksum information that can be used to verify the file's integrity. The recipient obtains the sender's public key and attempts to decrypt the signature. If the decryption is successful, the sender's identity is verified—because only the sender, using the private key, could have encrypted the signature in the first place. The decrypted signature's information—such as a checksum—can be used to verify that the file itself was not changed in transit. This kind of cooperation between sender and recipient requires specialized software on both ends of the file transfer.

As always, there are fine details that you need to be aware of. Although cryptographic signatures are fairly standard, you should ensure that they use a high level of encryption to prevent tampering—anything FIPS 140-2 compliant will do the job. Also, the checksum used to check the integrity of the transferred file is crucial. Older integrity checks use a cyclic redundancy check (CRC—you'll also see CRC-32 or XCRC, two more recent variants). Those aren't ideal because they're fairly easily fooled. A better approach is to use a cryptographic hash, such as a SHA-1 or MD5 hash (and even those have known weaknesses today, so newer algorithms such as SHA-256 are preferable). These produce a more reliable and tamper-proof "fingerprint" for a file.
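
As a quick illustration of the two kinds of integrity check, both CRC-32 and SHA-256 are available in Python's standard library. The point is not the code but the difference in what each output guarantees: CRC collisions can be manufactured deliberately, while a cryptographic hash gives a practically tamper-proof fingerprint.

```python
# Sketch: contrast a CRC-32 checksum with a SHA-256 cryptographic hash,
# both from the Python standard library. A CRC can be deliberately
# collided; a cryptographic hash resists tampering.
import hashlib
import zlib

data = b"Invoice total: $10,000"

crc = zlib.crc32(data)                      # 32-bit checksum
digest = hashlib.sha256(data).hexdigest()   # 256-bit cryptographic hash

print(f"CRC-32:  {crc:#010x}")
print(f"SHA-256: {digest}")
```

An attacker can pad a tampered file until its CRC matches the original; doing the same against SHA-256 is computationally infeasible.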

"Guaranteed delivery" also involves encryption. The recipient generates some kind of "delivery confirmation" and applies a digital signature to it. This allows the sender to further validate the recipient's identity, and gives the sender undeniable evidence of the successful delivery.

Auditing

Who is sending files is a big deal, especially if your organization is dealing with external security requirements and is subject to audits of your security practices. Auditing is a word that's actually used in a couple of ways:

  • It's the act of reviewing security logs and practices, typically by a human being who is checking for compliance with various regulations or requirements.
  • It's the act of saving security information to a tamper‐evident or tamper‐proof log, for use by those human auditors; this activity is more accurately referred to as logging.

File transfer solutions obviously don't do the first kind of audit, but they certainly can and should do the second: logging file transfer activity to a tamper-proof or tamper-evident log. Commonly, that log will either be some kind of proprietary database or an external relational database management system (such as Oracle or Microsoft SQL Server), and the anti-tamper techniques usually involve more encryption.

There's a certain amount of trust that you have to place in the software. Obviously, unless you wrote the software, you can't really prove that it is logging every single action; you have to trust that it is doing so, and perhaps perform some tests to demonstrate to yourself that everything seems to be logged. There should not be an easy way to disable logging once it's enabled, and the ability to disable logging should be kept out of the hands of anyone who uses the file transfer solution and might have cause to hide their activity at some point.
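
One common way products make a log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash, so altering any earlier entry breaks every hash after it. This is only a sketch of the idea, not any particular vendor's scheme; a real solution would also sign or externally anchor the chain.

```python
# Sketch: a tamper-evident log built as a hash chain. Each entry's hash
# covers the previous entry's hash, so any alteration is detectable.
import hashlib

def append_entry(log, message):
    prev_hash = log[-1][1] if log else "0" * 64
    entry_hash = hashlib.sha256((prev_hash + message).encode()).hexdigest()
    log.append((message, entry_hash))

def verify(log):
    prev_hash = "0" * 64
    for message, entry_hash in log:
        expected = hashlib.sha256((prev_hash + message).encode()).hexdigest()
        if expected != entry_hash:
            return False
        prev_hash = entry_hash
    return True

log = []
append_entry(log, "alice sent payroll.csv to partner.example.com")
append_entry(log, "bob received invoices.xml from vendor.example.net")
print(verify(log))    # True
log[0] = ("alice sent nothing at all", log[0][1])   # tamper with entry 1
print(verify(log))    # False
```

Because each hash depends on all prior entries, an attacker can't quietly rewrite history without recomputing, and thereby exposing, the entire chain.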

Authorization

Authorization is all about who is allowed to transfer files, what kinds of files they're allowed to transfer, when they're allowed to transfer, and what transfer options—such as whether they want encryption—they can specify.

This is obviously crucial for any kind of security concern. A good file transfer solution will understand this and give you the ability to specify that these groups of users can perform these kinds of file transfers, and so forth, customizing the solution to fit your needs.

Retention

Retention is becoming increasingly important when it comes to file transfer. Typically, a file transfer solution needs to keep a copy of a file on‐hand until the transfer is complete. Whether it is receiving or sending a file, the file is going to exist—possibly in an incomplete state, if it's being received—on the file transfer solution's hard drive, or in its memory, for some period of time.

You should be able to specify rules that govern this retention. For example, you may want the solution to move completed, received files to an alternate location, then securely erase the solution's own copy of the file. You might want sent files to be securely erased from the solution as soon as the transfer is successful. However, you might want received files to be retained—as a form of backup—for several days, even if the files were moved elsewhere by an automated process.

The important thing is to make sure that whatever you're using for file transfers can do what you want—ideally without you having to write a bunch of homegrown scripts to accomplish something your solution can't.
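
As a rough sketch of what such retention rules amount to under the hood: all paths, the three-day window, and the single-pass "secure erase" below are simplifying assumptions; real products use validated erasure routines and configurable rules.

```python
# Sketch: a retention policy for a transfer staging area. Files older
# than the retention window are archived as a backup, then the staging
# copy is overwritten and removed. Single-pass overwrite is a
# simplification of real secure-erase routines.
import os
import shutil
import time

RETENTION_SECONDS = 3 * 24 * 3600   # keep received files for 3 days (assumed rule)

def secure_erase(path):
    size = os.path.getsize(path)
    with open(path, "r+b") as f:    # overwrite contents before unlinking
        f.write(b"\x00" * size)
    os.remove(path)

def apply_retention(staging_dir, archive_dir):
    for name in os.listdir(staging_dir):
        src = os.path.join(staging_dir, name)
        age = time.time() - os.path.getmtime(src)
        if age > RETENTION_SECONDS:
            shutil.copy2(src, os.path.join(archive_dir, name))  # back up first...
            secure_erase(src)                                   # ...then wipe the original
```

A scheduled job would call `apply_retention()` periodically; the rules you actually need (erase immediately on success, retain for days, and so on) vary per transfer.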

Mapping Business Requirements to Technical Capabilities—Creating Your File Transfer Shopping List

By this point, you're probably ready to start considering a secure, managed file transfer solution for your company, or even for a specific department, division, or project. Before you start doing Google searches on "managed file transfer," however, you need to have a solid list of your requirements in mind. Although a lot of managed file transfer solutions are remarkably similar in basic capabilities, each of them does offer unique features that, depending on your needs, may be advantageous or disadvantageous to your organization. In this chapter, I'll examine specific business requirements that you may have and translate those to the technical requirements of a file transfer solution.

It's important for me to acknowledge that I can't determine which of these business requirements are important for your business; that's up to you. What I can do, however, is cover the ones that are important to a variety of businesses, explain why each one might be important, and let you use that information to construct your own shopping list for file transfer capabilities.

I'll use the term "shopping list" a lot in this chapter, and I use it very purposefully. Although I don't see a lot of actual lists at the grocery store any more, you should definitely create a written list of business requirements and technical features that are important to your business. I'll offer examples as I go, and you're welcome to use those as a sort of template for building your own list. Trust me—when it comes time to evaluate solutions, you'll appreciate having a detailed list handy.

Security

Security is the number‐one reason to invest in a managed file transfer solution. In fact, if security is of no concern whatsoever, then managed file transfer might not be in your future. However, as I outlined in the previous chapter, everyone should be concerned about security to some degree. With that in mind, let's look at some of the very specific security requirements that you might want for your managed file transfer solution.

Privacy and Encryption

Do any of your data transfers need to be kept private? Again, this is something where I've never really met a company who could honestly and completely say, "No." Encryption of some kind or another always winds up on the requirements list for file transfer because every company has data they need to protect. Whether you're being driven by internal security policies or by external requirements (such as the legion of compliance acronyms—GLBA, SOX, HIPAA, PCI DSS, FISMA, and so on), privacy—and thus encryption—is a pretty common requirement.

This is a good time to start documenting the technical capabilities and business requirements you've identified for your company. Figure 3.1 is a sample "shopping list" that I use for almost all software evaluations and acquisitions; you can see that I document not only the technical capability but also the business requirement—and most important, the reason for the business requirement (in this example, because I need to comply with the Payment Card Industry Data Security Standard). That way, if I get into a detailed discussion with a vendor, I know exactly what's driving each decision. Vendors can help me make better choices when they know why I'm asking for a particular feature.

Figure 3.1: Starting a file transfer shopping list.

Note that I've also provided an importance indicator. I usually use a scale of 1 to 5, with higher numbers representing something of more importance. Again, this is a huge time saver for me when I start evaluating actual solutions because I'll be able to score them and rank them against one another based on how well they meet my most important business needs.

For now, I'm not necessarily going to worry about what kind of encryption. That's a detail I'll dive into in the next chapter, and in my example, the kind of encryption will be driven by my need to comply with PCI DSS. That's one reason to note that if I'm talking to vendors, I don't necessarily need to ask for "1024‐bit key encryption;" I can just say, "I need encryption that will be compliant with PCI DSS requirements." Most vendors have already done their homework in that area and know whether they're compliant, and there's no reason not to take advantage of the effort they've put into it.


Annoyingly, most compliance efforts—especially those driven by legislation—contain no technical specifics whatsoever, meaning that even weaker 40‐bit encryption would technically satisfy the rules. The trick is that the rules only specify that you "keep the data private" or some other nontechnical phrase, meaning that if you do choose weaker encryption, and it's broken, you've failed to comply. That leaves you a bit on your own—which is why it can be nice to work with a vendor who understands the difficulty and can help guide and educate you. Remember, they've worked with hundreds of customers—they've probably run into your same questions over and over.

Non‐Repudiation and Delivery Tracking

As I described in the previous chapter, delivery tracking, guaranteed delivery, and non-repudiation are important requirements for many companies. From a business perspective, these work together to ensure:

  • You can prove that a given file was in fact delivered to the intended recipient
  • The recipient can verify that the file was not altered in‐transit, and that what they have is exactly what you sent

Of course, if you're on the receiving end, the reverse is true: You want to confirm who the file came from, for example. Believe it or not, these capabilities are often requirements when you're dealing with industry and legislative compliance, although they may not always seem like an obvious requirement.

Here's why: Compliance rules such as HIPAA and PCI DSS require you to keep track of where your data goes and who accesses it. Once data is out of your hands, of course, you're no longer responsible—provided you played by the rules when transferring it to another party. Imagine an example where two banks are working together, and one needs to transfer credit card member data to the other. Both banks are aware of the PCI DSS rules, and the transfer takes place. A month later, it comes out that some of the cardholder data was leaked and Bank One, who owned the data in the first place, is under investigation. Well, maybe the problem wasn't at Bank One at all, but if they can't prove they transferred the data to Bank Two—encrypted and playing by all the other rules—then Bank One might still be on the hook for fines and penalties. Delivery tracking and non-repudiation let Bank One prove that another, authorized party was involved, and the investigation can expand to include Bank Two. Compliance is, in many ways, about evidence, and these particular capabilities let you provide the additional evidence you need for many situations.

Logging

Aside from encryption, logging is easily the next most‐wanted security feature in a managed file transfer system. Businesses want to know who sent what file to whom, and when—and logging can provide that information.

This is another very important area where you need to be clear on exactly why you want logging:

  • Because administrators need to be able to troubleshoot file transfers, and the log will contain valuable troubleshooting details
  • Because compliance requirements mean you have to keep track of every movement of specific kinds of data
  • Because you want to implement service charge‐backs against departments that utilize the file transfer infrastructure
  • Because you need to support eDiscovery activities, meaning you will need to not only archive the logs but also the data that was transferred

These are all very different requirements. The compliance requirement probably demands a tamper-evident log so that any alteration to the log will be obvious and you'll know the log has been compromised. The charge-back requirement might call for additional details, like file sizes and transfer times, so that charges are accurate. The reason for your logging requirement will be hugely important when you start looking at solutions and talking to vendors.

Authentication and Authorization

Authentication is the process by which users identify themselves to a computer system, such as by typing in a username and password. Authorization is the means by which a computer determines what an authenticated user is allowed to do—their permissions, in other words. Any managed file transfer system will support both authentication and authorization, but how they do so will vastly impact their usefulness to you.

Some solutions, for example, may only maintain their own internal database of user accounts, passwords, and permissions. That might be exactly what you need: a self-contained solution. Other companies might prefer a solution that lets users authenticate with an existing identity, such as their Active Directory (AD) account. Some organizations may have internal policies that require strong, multi-factor authentication such as biometrics or smart cards. Some solutions permit the use of different authentication mechanisms for different scenarios.

I find that the most flexibility comes from file transfer solutions that can connect to an external directory, such as AD or some other LDAP‐based enterprise directory. Why? Those enterprise directories do authentication. That's almost all they do. And so they do it well, meaning they can usually handle biometrics, smart cards, security tokens, or whatever other authentication mechanism you might need, either natively or via extensions. By letting the directory authenticate everyone, your file transfer solution can focus on file transfers.

Of course, if your favorite file transfer solution only supports internal usernames and passwords, all is not necessarily lost. Figure 3.2 illustrates an identity architecture that utilizes a metadirectory. This is designed to support environments that have multiple solutions—such as a file transfer solution and an Enterprise Resource Planning (ERP) solution—that maintain their own user directories. You manage identities in the metadirectory; it then replicates and synchronizes that information across all your subsidiary directories. Microsoft's Identity Lifecycle Manager (ILM) is one example of a metadirectory.

Figure 3.2: Using a metadirectory to synchronize identities.

If your organization already has a metadirectory, your "authentication" requirement might simply be for a file transfer solution that can have its user database managed by the metadirectory, if direct integration with something like AD isn't a possibility.


The real point is to make sure you're documenting the why behind each requirement. Don't just say, "Active Directory integration required;" also note that it's because "single sign‐on is an internal policy goal." Vendors may have other ways of achieving single sign‐on—the real requirement—that you haven't thought of.

Be very careful if you decide that single sign-on is exactly what you want. In some cases, you can manage a file transfer solution more securely when it does not integrate with your existing directory. Single sign-on is about convenience, but it can provide broad access to many users when that's not really a business requirement. For example, you might only want to grant users access to the file transfer system for a short period of time. With a non-integrated user database, that's easier: Just create a file transfer user account within the file transfer solution, and perhaps even configure it to auto-expire after a few hours or days. Because users aren't using the corporate-wide directory identity, you can configure different parameters—such as stronger or weaker passwords, or account expiration—than what might be appropriate for a company-wide identity.
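
The auto-expiring account idea can be sketched in a few lines; the field names and lifetime here are purely illustrative, not any product's actual schema.

```python
# Sketch: short-lived accounts in a non-integrated user database, where
# each account carries its own expiration timestamp.
import time

accounts = {}

def create_account(username, lifetime_seconds):
    accounts[username] = {"expires_at": time.time() + lifetime_seconds}

def is_active(username):
    account = accounts.get(username)
    return account is not None and time.time() < account["expires_at"]

create_account("partner-upload", lifetime_seconds=48 * 3600)  # valid for 2 days
print(is_active("partner-upload"))   # True
print(is_active("nobody"))           # False
```

Because the account lives only in the transfer system, its password rules and lifetime can be tuned without touching the corporate directory.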

Protecting Against Attacks

Everybody thinks they won't get attacked…until they are. Having a file transfer solution doesn't necessarily make you more prone or vulnerable to attacks, but it is another thing that can be attacked. Some file transfer systems are designed to sit firmly behind your firewall, which should do most of the protecting that's needed. Firewalls that provide specific support for stopping or reducing Distributed Denial‐of‐Service (DDoS) attacks, in particular, are valuable because the firewall takes the brunt of any attack and helps shield the file transfer solution. In these instances, the firewall's external IP address is usually the one that business partners and others will use; the firewall then passes that traffic—often after checking it for validity—to the file transfer solution. Figure 3.3 illustrates how this works.

Figure 3.3: Protecting the file transfer solution.

Other file transfer solutions are specifically designed for use in the "demilitarized zone," or DMZ, outside your main firewall. These can sometimes be sold as a customized build of an operating system (OS) like Linux, providing a more locked‐down and hardened server; these more‐secure file transfer solutions are built with the understanding that they're more exposed than ordinary servers, and often contain their own features for surviving attacks such as a DDoS attack. Still, smart companies will usually install these behind a firewall, creating a second "intranet" as their DMZ rather than installing servers (other than firewalls) that connect directly to the Internet. Figure 3.4 illustrates this approach.

Figure 3.4: Installing a file transfer solution in a firewalled DMZ.

Frankly, "the more firewalls the merrier" is my usual approach.

How Do You Stop a DDoS Attack?

When a solution—whether it's a firewall or a file transfer server—is designed to understand the potential for a DDoS attack, the response is usually an auto‐ban. Computers are able to look at a source IP address very easily without incurring a lot of overhead. When a server determines that it's receiving a bit too much traffic from an IP address—as can occur during a DDoS or "hammering" attack—the server can simply instruct its network stack to drop all packets originating from the offending address(es).

This isn't always successful; some DDoS attacks "spoof" their source IP addresses to random ones, specifically to overcome the auto‐ban technique. But servers can also use a variety of other techniques to help defend themselves.
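
The auto‐ban idea is simple enough to sketch in a few lines. This is only an illustration: the window size and threshold below are invented, and real firewalls implement this in the network stack rather than in application code.

```python
import time
from collections import defaultdict, deque

# Illustrative values only; real products tune these carefully.
WINDOW_SECONDS = 10
MAX_HITS_PER_WINDOW = 100

hits = defaultdict(deque)   # source IP -> timestamps of recent traffic
banned = set()              # addresses whose packets should be dropped

def should_drop(source_ip, now=None):
    """Return True if packets from source_ip should be dropped."""
    if source_ip in banned:
        return True
    now = time.time() if now is None else now
    window = hits[source_ip]
    window.append(now)
    # Forget traffic that has aged out of the measurement window
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    # Too many hits inside the window? Auto-ban the source address.
    if len(window) > MAX_HITS_PER_WINDOW:
        banned.add(source_ip)
        return True
    return False
```

As noted, an attacker who spoofs random source addresses defeats this particular technique, which is why it's only one layer of a server's defenses.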

Other Security Concerns

There are a few other "miscellaneous" concerns that fall under the category of security. See if any of these might be requirements for your organization:

  • Auto‐Expiration of Data. Many organizations don't want data sitting around on a file transfer server any longer than it needs to, and some solutions offer rule sets that can automatically remove data once it's past a certain age. You might also want the information removed after a specified number of downloads, and depending upon your organization's security needs, might want to ensure that removal is done by means of a secure wipe rather than a simple deletion.
  • Secure Erasing. In order to transfer a file, a copy of the file needs to be on the file transfer server for at least a short period of time. Solutions that support secure erasing can ensure that no traces of files remain on the server once the transfer is complete and the file is removed.
  • User Account Management. For file transfer solutions that maintain their own user databases, having the ability to expire user accounts at a certain date, auto‐expire accounts a certain number of days after their creation, enforce strong passwords, and so on can make for a more secure environment.
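
To make the "secure wipe versus simple deletion" distinction concrete, here's a minimal overwrite‐then‐delete sketch. It's an illustration only: real solutions use vetted erase routines, and on SSDs and journaling file systems an overwrite like this may not destroy every physical copy of the data.

```python
import os

def secure_wipe(path, passes=3):
    """Overwrite a file's contents before deleting it.

    A simple deletion only removes the directory entry, leaving the
    old bytes recoverable; overwriting first makes casual recovery
    much harder.
    """
    length = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(length))  # replace contents with noise
            f.flush()
            os.fsync(f.fileno())         # push the overwrite to disk
    os.remove(path)
```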


Deployment

How will you get your file transfer solution up and running? This isn't something a lot of companies consider up front, but how—and how quickly—you deploy a solution can have a major impact on your IT staff, your users, and other elements of your company.

Hosted vs. On‐Premises

There are two broad deployment scenarios for a managed file transfer solution, although not every vendor offers both. The first is hosted, meaning it's managed by a service provider or provided as "software as a service" (SaaS), and the second is on‐premises. In both scenarios, the file transfer solution usually works exactly the same; the only difference is where it physically lives.

SaaS deployments are often fast; some vendors can have you up and running in a few hours or a few days. They require very little in the way of special skills or attention within your company, and might involve little or no work from your busy IT staff. You won't have to worry about patching or maintaining the solution, either; the hosting vendor will take care of all that (or should—be sure to check). You may pay more over the long run, but you're up and running quickly. A SaaS deployment may not offer as much in the way of customization, and it may require some tweaks to your infrastructure so that users can communicate with, and use, the solution. Your data will be in someone else's hands, however briefly, so some compliance‐sensitive data may not be suitable for a SaaS deployment.

On‐premises deployments mean you're installing the solution right in your own data center. You'll have full control over the process, and your data will be entirely under your control, which might be a must‐have feature if you're dealing with strict compliance requirements. However, you're going to be spending some IT staff cycles deploying and maintaining the solution, and you're going to have to invest in some infrastructure—like a server machine—to support the solution.

Neither hosted nor on‐premises is better than the other; they're different. Some companies start with a hosted deployment and then migrate the solution in‐house; if that's a possibility for you, be sure to include that fact in your shopping list and look for a file transfer vendor that has some experience with that scenario.

Needed Skills or Services

You'll need to work with your eventual file transfer vendor to determine what skills or services you'll need to deploy their solution. Some solutions can be deployed with skills you might already have on your IT team; others might require a consulting services engagement with the vendor. Consulting means less work for your staff, but more money; neither the "do it yourself" nor the "outsource it" approach is inherently better than the other, but they are different. If your company has a preference, indicate that as a requirement on your shopping list.

Deployment Timeframe

How long will a solution take to deploy? In this instance, while I'm a big fan of working with vendors and relying on their expertise, I'd caution against taking the salesperson's word for it. Most are probably perfectly honest, but some might be tempted to fudge the numbers a bit if they sense a deal in the works. Instead, simply ask the vendor for a couple of references: customers who've implemented the solution and can tell you how long it took them. Of course you can expect to get "best case" numbers, but you can also ask if there were any particular challenges that you need to watch out for.


One advantage of having the vendor's consulting arm deploy your solution is that they can often guarantee a timeframe because the deployment is more under their control. If time is a real factor for your deployment, letting experts with plenty of past experience handle the work is definitely a way to put the project in the express lane.

High Availability

Do you need high availability? That depends: If your file transfer server goes down—and you have to assume it will eventually, if for no other reason than a failed power supply in the server or something—will your business suffer? If it will, then high availability needs to make it onto your shopping list.

Vendors accomplish high availability through varying means. In some cases, as shown in Figure 3.5, an external load balancer may be used to direct traffic to one server or the other; if the "active" server becomes unavailable, traffic is instead routed to a standby. Both servers access a common file system, often on a Storage Area Network (SAN), so they can take over for each other.

Figure 3.5: High‐availability options.

In some cases, both servers can operate simultaneously, balancing workload between them; in others, one is active and the other is a passive standby. Some vendors implement this without the use of external devices or routers; others rely on OS features like Windows' Cluster Service to attain high availability. This is one of those areas that can contain a lot of subtle variations between vendors; don't just take "high availability" as a checkmark on your list. Find out how each vendor does high availability, and make sure you're happy with whatever answer they provide.


Compatibility

How well will a file transfer solution fit into your existing environment? If you're looking at a hosted deployment, this is less important; if you're going to be hosting the solution yourself, then making sure the solution works with what you already have in place is going to be pretty critical.

Supports Your Database?

Most file transfer solutions utilize a database of some kind to log activity, maintain their user database, store workflows and other customizations, and so on. Some solutions use an entirely‐internal database, which is fine; just make sure you understand what kind of maintenance that database will require, such as backups, data purging, and so on.

Other solutions use a standardized database platform, such as Microsoft SQL Server or Oracle. There's no one right answer, and no database is better than another for this purpose, but if you're an all‐Oracle shop, you might not be excited about having to support SQL Server. You might prefer to run the file transfer database on an existing database server, too, to conserve licensing costs; if that's the case, specify your preferred databases in your shopping list.

Be alert for solutions that use Microsoft SQL Server Express. Express isn't a bad product, and using it as an "embedded" application database is exactly what it's intended for. It is not, however, maintenance‐free: It needs to be backed up, it needs to have its performance tuned, and so forth. Many vendors who use Express write their own code to handle these maintenance tasks, which is a great convenience. If Express is what a solution uses, just educate yourself about what maintenance—if any—it will require of you. I've run across a few lower‐end application developers who choose Express because it's free and easy, but who don't educate their customers about the maintenance requirements—and before long the application isn't performing well.

Also note that Express uses the same core database engine as the "full" editions of SQL Server; if you'd rather not have a solution use Express, but already have a SQL Server that could be used instead, ask the vendor if that's possible. It should be; aside from changing a connection string, the vendor shouldn't have to make any changes to work with the "full" SQL Server editions.
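
As an illustration of how small that change usually is, here are two hypothetical ODBC‐style connection strings; the server names are invented, and the exact format varies by solution and driver. Only the SERVER portion differs between the Express instance and the "full" SQL Server.

```python
# Hypothetical connection strings; the server names are made up.
EXPRESS = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    r"SERVER=localhost\SQLEXPRESS;"      # local Express named instance
    "DATABASE=FileTransfer;"
    "Trusted_Connection=yes;"
)
FULL = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sql01.corp.example.com;"     # existing "full" SQL Server
    "DATABASE=FileTransfer;"
    "Trusted_Connection=yes;"
)
```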

Works with Virtualization?

Is your company on the virtualization bandwagon? Will you run your file transfer solution in a virtual machine rather than on a physical one? If so, just make sure there aren't any issues in doing so, and that the vendor will support you in that scenario. Most should; some might feel that the virtual environment introduces variables that they don't want their support staff to have to deal with. Again, there's no wrong answer, just what's important to your organization.

Works with Client Computers?

Will the file transfer's client components work with all your client computers? If you're an all‐Windows shop, the odds are that the answer is "Yes" because everyone writes Windows‐compatible client components. However, if you've got non‐Windows stuff, you might want to make sure that there's a plan for it. Cross‐platform clients written in Java are one example; Web‐based clients are another.

Figure 3.6 shows my shopping list at this point. I want to point out a particular element—about halfway down, I have a requirement for a "Java‐Based Client." This is a common type of requirement that often comes out of internal discussions or discovery; I've noticed that my organization has a few Unix and Mac computers, so a Windows‐only solution might not be perfect for us. This isn't hugely important—only a 2 out of 5—but I want to make note of it anyway. Also notice that I've made some notes indicating that it's the multi‐platform aspect of Java I'm after, not necessarily a love affair with Java itself that's driving the requirement. I've also made a note that other cross‐platform alternatives might be acceptable, such as a Web‐based client. That's important because as I begin evaluating products, I have something other than "Java client" to look for. The otherwise‐perfect solution might not have a Java client, but it might well have a Web‐based client that does just as good a job—or even better. By making sure I note the flexibility in my requirements, I—or whoever is evaluating solutions—will be able to keep it in mind.

Figure 3.6: Finishing up my shopping list.

Workflow and Automation

One big reason—aside from security—to move to managed file transfer is the ability to have review/approval workflows and powerful automation tools. If you'll need these, then consider some of the common details in the next three sections.

Programming vs. GUI‐Based Workflow Building

How do you build workflow into a file transfer solution? Some provide graphical drag‐and‐drop tools that let you build flowcharts, which the solution then follows. Others provide dialog‐based "wizards" that are, to my way of thinking, an even easier way to create custom workflows (I hate drawing flowcharts). Still others require you to master a scripting language of some kind, which might be right up your alley.

My general recommendation to my consulting clients is to aim for something graphical—either a drag‐and‐drop workflow builder or a dialog‐based "wizard" style interface that "interviews" you and builds a workflow based on your choices and answers. Both of these might well spit out a script in the background, but I don't generally recommend that you purchase a solution with the intent of building workflows purely in script. Being able to do so as an additional option is nice, especially for extremely complicated workflows, but in general, you'll enable more people to build and maintain workflows if less programming or scripting is required.


Automation Needs

What sort of automation capabilities will you need? This is probably one area where you'll have to do the most investigation into how the solution will be used. Will it need to move files from place to place internally? Will it need to access databases? Figure out exactly what you'd like it to do, then see what solutions are out there that meet, or come close to, your needs.

Support for Delegation

Who will manage the file transfer solution? Here's where it's easy to make a mistake, so let me try to prevent that. A lot of companies purchase file transfer solutions to solve a particular division, department, or project‐based need. That means their delegation requirements are initially simple: Joey over in Sales will run this thing, and that's all we need. Very quickly, though, the file transfer solution starts to be used by other departments or projects, and Joey over in Sales gets too busy. At that point, you need more powerful and flexible delegation options, and you can't add those to a solution that doesn't have them in the first place.

Thus, with regard to "who will manage the file transfer solution," I advise you to imagine the solution being used by your entire company. Make sure it offers delegation that's flexible so that various people can be put in charge of various pieces or elements, and that "who is in charge" can easily be changed over time. You might not need that flexibility up front, but you likely will in the end.

Also—and this is critical if you're dealing with compliance requirements—make sure that the solution supports separation of duties. That is, pretty much nobody who manages the solution on a daily basis should be able to change auditing or logging settings, clear audit logs, and so on.


Programmability

You may have a need to integrate file transfer with other business applications or processes, and that's where programmability comes in.

APIs for Automation and Customization

If you need to—or think you may need to—integrate your file transfer solution with other applications, then make Application Programming Interfaces (APIs) an item on your shopping list. Typically, a file transfer solution API will allow another application to start, monitor, manage, or control file transfer operations. For example, you may have an application that generates a large amount of data; that application could use a file transfer API to automate the transfer of that generated data to its destination.

There are some obvious advantages to integrating a file transfer solution in this fashion; however, there are also some downsides. Once you've written code against a specific API, that code becomes difficult and time‐consuming to rewrite. Writing to a solution's API pretty much makes that solution a permanent feature of your environment; make sure you're willing to take that dependency before making the decision to use APIs.
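
To make the idea concrete, here's a hypothetical sketch of what driving transfers through an API might look like. The FileTransferClient class, its methods, and the paths below are all invented for illustration; every product defines its own API, and yours will look different.

```python
# A toy stand-in for a vendor's file transfer API; everything here
# is hypothetical.
class FileTransferClient:
    def __init__(self, server, api_key):
        self.server = server
        self.api_key = api_key
        self.jobs = {}

    def submit(self, local_path, destination):
        """Queue a transfer and return a job id for later monitoring."""
        job_id = len(self.jobs) + 1
        self.jobs[job_id] = {"path": local_path,
                             "dest": destination,
                             "status": "queued"}
        return job_id

    def status(self, job_id):
        """Report a submitted job's current state."""
        return self.jobs[job_id]["status"]

# An application that generates data could automate its delivery:
client = FileTransferClient("mft.example.com", api_key="...")
job = client.submit("/exports/orders.csv",
                    "ftps://partner.example.com/inbox/")
print(client.status(job))
```

Notice that the calling application is now written against this one client class; that's exactly the kind of dependency the paragraph above warns about.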

Specific Programmability Needs

It's difficult for me to address all the needs that can arise around programmability; you need to look at your environment's own capabilities and needs. Will you need APIs that are accessible via Microsoft's .NET Framework? Windows' Component Object Model (COM)? Java? Something else? Make a list of the programming technologies already in use in your environment, and try to find a solution that natively supports those technologies, or can do so through a vendor add‐in or extension.


Protocol Support

This is probably one of the easiest requirements to map out, but do it in two parts. First, which protocols do you know for a fact you will need? Obviously, a solution is only useful if it supports those at a minimum. Then, look at every other protocol the solution supports, and try to get the solution that supports everything you know you need, and as many other protocols as possible. For example, you can use this as a checklist of what's possible:

  • AS1, AS2, and AS3
  • Local file system
  • Network shared folders (SMB)
  • FTP
  • FTPS
  • SFTP / SCP2
  • HTTP
  • SMTP

Why worry about things you don't need? Because you might need them in the future, and you'll want to have as much flexibility as possible.

External Connectivity

What types of file transfers will you need? In Chapter 1, I discussed the differences between user‐to‐user, user‐to‐system, and system‐to‐system transfers; each has slightly different techniques and requirements that you might want to consider.


User‐to‐User Transfers

These are the ad‐hoc transfers that users often accomplish through email attachments. Typically, file transfer solutions support this by providing an end‐user interface—either a standalone client utility or a Web interface—that talks to the central file transfer server to accomplish file transfers. Commonly, both users—that is, the sender and recipient—will have to have a specialized file transfer client. Different vendors accomplish this in different ways, using cross‐platform client software, Web applications, and so on. Some file transfer solutions may also support the use of email, allowing your internal users to specify a file to be transferred, then having the file transfer solution encrypt the file and attach it to an email. There's a lot of variety in how vendors approach this task, and it's an area where you'll want to ask a lot of questions and determine what's right for your company's needs.

Enforcing User‐to‐User Transfers

If user‐to‐user file transfers are a business need—and I haven't run into many businesses where they aren't—make sure that, in addition to acquiring a solution that supports them, you have the capability to enforce the use of that solution. In other words, you need to be able to turn off other ways of completing ad‐hoc user‐to‐user transfers. This might include restricting the use of instant messaging clients that allow file transfers, restricting the size of message attachments in your email system, and so on. You might also use your firewall to block ports commonly used for protocols like FTP, although you'll obviously need to provide an exception for the file transfer system itself. You might also block outgoing traffic in ports used by torrent and other file‐sharing clients—just to ensure that files aren't being transferred outside the managed system.

It's very, very difficult to block all forms of file transfer. Microsoft's Background Intelligent Transfer Service (BITS), for example, can upload as well as download. BITS is used by Windows Update, so you don't usually want to disable the service outright; it uses HTTP, so it's difficult to separate BITS traffic from ordinary Web browsing. All things to keep in mind.


User‐to‐System Transfers

This isn't much different from a user‐to‐user transfer, although the "person" on the other end is usually a Web site or FTP server. That means less specialized software on the receiving end of the transfer. Again, vendors vary greatly in how they implement this, so ask questions of the vendor if you'll need to provide user‐to‐system transfers for your users.


System‐to‐System Transfers

These transfers are usually automated, between two servers—often via some form of FTP or via HTTP or HTTPS. This is the type of transfer that gets most companies looking at managed file transfer in the first place.

Evaluating and Selecting a Secure, Managed File Transfer Solution

In the previous chapter, I outlined some of the common business requirements surrounding managed file transfer. I explained where some of those requirements come from, and hopefully helped you figure out which requirements apply to your company and your particular situation. At this point, you should have constructed a "business requirements checklist," not unlike the one in Figure 4.1 (which is what I used to wrap up the previous chapter).

Figure 4.1: Sample business requirements for managed file transfer.

The idea is that this list should contain everything that's important to your business, some notes about why they're important, and an indication of exactly how important they are relative to one another. I use a scale of 1 to 5, with higher numbers representing more important capabilities. This list should focus on general business requirements, not technical features. I didn't specify what kind of encryption is important, for example, only that encryption in general is important because my organization needs to comply with PCI DSS. I've indicated that a hosted solution is pretty important, although I'd be okay with an on‐premises solution if the vendor can provide implementation services—my company just doesn't have the time to deploy a solution on our own.

Conducting Your Evaluation

These business requirements form the list that you take to solution vendors, and you get them to show you how their solution meets those needs. Obviously, every solution is going to differ somewhat in how they implement each feature that covers your business needs; part of the evaluation is to decide which implementation you like the most.

Figure 4.2 shows how I like to score solutions during an evaluation: I list all of my business needs and their importance, then rate each solution on a scale of 1 to 5, with higher numbers indicating a solution that does a better job of meeting that business need. I keep a separate list of notes with the details behind why I awarded the score I did—sometimes, as I explore different solutions, something I see in "Product X" makes me change my mind about how I viewed "Product Y," and I can go back and adjust scores as needed.

Figure 4.2: Product feature comparison.

You'll find a template for this kind of product comparison in drawing products such as Microsoft Office Visio and ConceptDraw PRO; you could also whip up something similar in a spreadsheet like Microsoft Office Excel, if you prefer.

Here's how I use the chart:

  • Each product gets a score from 1 to 5, based on how well it supports my business need. I make pretty extensive notes that justify each score, along with explaining any subtle details I've picked up during my investigations.
  • I multiply each score by the importance of that business feature, giving each product a "total" for that feature. This helps weight more important features: A product that does a great job on something that's not very important to me won't overtake a product that does a "pretty good" job on something that's mission‐critical for me.
  • In the last column, I indicate which product(s) "won" for that business need. The last column lets me take a quick glance and see if any one product stands out—if I see mostly "A" in the last column, then product A is probably going to go on my short list for a lab trial or pilot project.
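
The scoring arithmetic in the steps above can be sketched in a few lines; the business needs, importance ratings, and scores for hypothetical products A and B below are invented for illustration.

```python
# (business need, importance 1-5, product A score 1-5, product B score 1-5)
needs = [
    ("Encryption (PCI DSS)", 5, 4, 5),
    ("Hosted deployment",    4, 5, 2),
    ("Java-based client",    2, 1, 4),
]

totals = {"A": 0, "B": 0}
for need, importance, score_a, score_b in needs:
    weighted_a = importance * score_a   # score times importance
    weighted_b = importance * score_b
    totals["A"] += weighted_a
    totals["B"] += weighted_b
    winner = ("A" if weighted_a > weighted_b
              else "B" if weighted_b > weighted_a else "tie")
    print(f"{need}: A={weighted_a}, B={weighted_b}, winner={winner}")

print(totals)
```

In this made‑up example, Product B easily wins the low‑importance "Java‑based client" need, but Product A's edge on the more important "Hosted deployment" need gives it the higher overall total; that's the weighting effect at work.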

My job in this chapter is to help you understand some of the subtle details that come into play when evaluating solutions. As I pointed out in Chapter 2, for example, all encryption is not equal; in this chapter, I'll give you some additional pointers on what that means, and what to look for when you're looking at different solutions.

Beware the Details

At a high level—just naming off features like "encryption" and "protocol support"—you'll find that most managed file transfer solutions are practically identical. Or seem identical. Like most technology products, these vendors know what's going to be on your feature checklist, and they aim to have everything you'll need. But each of them goes about it in a different way, and some implementations might work better for you than others.

For example, in the previous chapter I mentioned that "high availability" can be implemented in a lot of different ways. Some vendors might build their solution on top of Microsoft's Windows Cluster Service; others might use their own high‐availability architecture. Neither one is wrong, but your company might not want to build a Windows Cluster to support a file transfer server—so that solution might not score as well for you. However, your company might have a bunch of existing Windows Clusters, and dropping a file transfer solution on to one of them might be a perfect fit—so that solution would score better with you.

It's those subtle details that make all the difference between solutions, and that's what this chapter is ultimately going to be all about. I want to emphasize that there's never a wrong answer in these kinds of details; there's only what's best for your particular situation. My goal is just to get you thinking about these details, so you can start deciding what might be best for you.

Strategic Tips

Before I start diving into feature details that you need to keep in mind, I want to cover a few broad strategic tips. As you look at different solutions, these tips are things you want to keep at the forefront of your mind at all times. In many cases, these strategic tips can change the way you view a product, help you expand your list of business requirements, and so forth.

Beauty Is Only Skin Deep

Let's face it, we all love a great‐looking user interface. Most of us use Windows, a Mac, or a Unix or Linux graphical desktop, and we tend to appreciate slick‐looking graphical user interfaces (GUIs). And there's nothing wrong with that. However, take a look at Figures 4.3 and 4.4—can you tell which of the two products is the better one?

Figure 4.3: Product A GUI.

Figure 4.4: Product B GUI.

Both are graphical FTP clients, but that's not actually the point. Product B is, for many people, the more attractive of the two. It has modern‐looking icons, matches the Windows XP color scheme, and so on. Product A seems like a blast from the past, with its 16‐color icons and Windows 95 color scheme. The point, however, is that you can't tell which one is better just by looking at the user interface. What really matters is the functionality under the hood. Beauty truly is only skin deep.

This is even truer for a managed file transfer solution. Keep in mind that the administrative user interface in particular needs to be functional, not necessarily beautiful. Much of a managed file transfer solution's work is done under the hood, out of sight; that's the functionality you should be evaluating.

It's easy for most software developers to create a slick‐looking user interface: They buy a pre‐built user interface library, add it into their project, and they're done. But that tells you nothing about what's under the hood. Different vendors and different development teams often have different priorities; one team might set aside time to get an all new user interface library integrated so that their product can look just like the latest version of Microsoft Office, with whatever neat new toolbars Microsoft has cooked up for that version. Other development teams may choose to use a simpler user interface library and instead set aside more time for server functionality—which is the approach I tend to favor as a customer because I'm shopping mainly for function, not beauty, in a server product. Neither team is wrong, but you must keep in mind that you'll be living with the product's functionality for a long time, and you can't make any judgments about it based upon the skin.

Buy for the Project, Plan for the Enterprise

I have a lot of conversations with my clients about the products they're considering, including managed file transfer solutions. In almost every conversation, my client is considering a file transfer solution to solve a particular project need or to meet the needs of a particular department. Getting companywide funding for software purchases can be difficult, so they tend to respond to smaller, project‐based budgets instead. That's a great tactic, but it presents a distinct problem.

As I've written before, I almost always see this happen: A department, division, or project identifies a need for a managed file transfer solution. In one case, it was to transfer customer orders to various vendors via secured FTP and secured HTTP. The IT department then becomes involved to help evaluate and select an appropriate solution. They properly consider every business need that the project has, and they select a solution that does an excellent job for that project. Next, another department, division, or project realizes that they, too, need managed file transfer. They come up with a list of their own business requirements, and get IT involved to find out what the company's existing solution can and cannot do.

In many cases, the new requirements don't exactly match the old ones—and in many cases, the solution the company bought for one project won't meet all the needs of another. The company then winds up buying another managed file transfer solution…and the IT department starts hating their lives because now they have to manage, support, and maintain two distinct solutions.

I always recommend to my clients that they at least think about other ways in which the company might use managed file transfer. You obviously can't go on an endless fact‐finding mission to gather business requirements that don't yet exist, but you can use your imagination and think about other areas, and what business requirements they might have. As you're evaluating solutions, always ask yourself, "Does this meet the current requirements?" but also ask yourself, "Can this grow to perhaps be used as a single, companywide solution?"

By stretching your parameters just a bit, you can help avoid a situation where you wind up with multiple solutions from multiple vendors. It may be worth spending a bit more, for example, on a solution that offers broader protocol support and programmability, even if those things aren't strictly on your current list of requirements, simply because those things will help the solution flex to meet unforeseen business requirements in the future.

When Is Software Like a Marriage?

When it costs you money and is incredibly difficult to change your mind. Any kind of server‐based, backend software solution represents a major commitment. You're going to be basing business processes on your managed file transfer solution. You're going to be teaching users how to use it. You're going to be paying money for it, and more money for ongoing maintenance. Changing your mind and switching to another solution is going to involve pain not unlike a divorce, as you migrate business processes, re‐train users, spend more money on a new solution and on maintenance for that, and so on.

I find it's cheaper and less exhausting in the long run to err on the side of caution during your first managed file transfer purchase. Spend a bit more, and get a solution that can grow to handle needs you haven't foreseen—it'll be cheaper than redoing everything later. Spend some time researching the vendor you're thinking about purchasing from, too—because you're marrying them just as much as you're marrying their software. Ask for references from other customers. Talk to colleagues at conferences and trade shows, and see if anyone else is working with that vendor. Ask yourself some questions:

  • How long have these people been in business? How stable are they? Do they have investors and money in the bank or are they teetering on the edge of solvency?
  • How long have they been working with managed file transfer? Are they established veterans in the field or are they newcomers? There's nothing wrong with new players in an industry, but you need to assure yourself that they're going to stick with it.
  • Do they seek out external partnerships with major vendors? Do they seek out independent testing and certification for their products? These are signs of healthy, competitive companies.
  • How quickly do they respond to bugs? Do they have a history of quickly releasing "hotfixes" or do bugs always take months to fix? Talking to existing customers is a great way to get this kind of insight.
  • How quickly do they respond to feature requests? Is their product development driven at least in part by what existing customers ask for or are they solely driven by what their marketing department thinks will sell new licenses? A good vendor will have a mix of customer‐driven and internally‐driven priorities; again, speaking with existing customers as well as with the vendor's development managers can provide insight on this.

Also try to find out a little about the history of the software you're considering. Was it developed in‐house? Was it acquired from another company? If it was acquired, was the development team also brought in‐house? How many people work on the software full‐time? Again, there are no wrong answers here, but the answers will help you gain a feel for where the product sits in the vendor's corporate hierarchy.

Criteria for Business Requirements

Now let's start diving into some of those subtle details I've been writing about. As you review your business requirements, consider the following additional, more‐specific criteria.

Security Requirements

Security is such a big driver for managed file transfer adoption that I'm going to tackle this subject first. I have to warn you that this security stuff can get very detailed—that's kind of the whole point of IT security, I think: details. Yet this is also an area where some vendors occasionally "gloss over" a critical detail or two. In some cases, they do so because their product isn't quite on the leading edge; in other cases, it's because they genuinely misunderstand some of the fine details. Regardless, after reading the next few sections, you'll be able to lead the security conversation as you evaluate products.

Encryption Levels

When I first sat down to write this section, I had a whole ream of notes, covered in phrases like "256‐bit symmetric keys" and "1,024‐bit asymmetric keys." I even had a big argument with a colleague who asserted that 1,024‐bit asymmetric keys were unbreakable, despite the fact that cryptography professor Arjen Lenstra was asked if 1,024‐bit keys were dead and replied, "The answer…is an unqualified yes".

Looking at those notes, I realized that I'm not writing a book about encryption—although I certainly could, with all this material. You probably don't need or want to become an expert on encryption any more than I want to write a whole book about it. So let's find someone more paranoid, from a security perspective, than ourselves, and see what they've done.

For me, that would be the US government. If anyone is paranoid about keeping secrets, it would have to be them. And so they had the US National Institute of Standards and Technology (NIST) whip up not only standards for encryption but also tests to determine whether a given piece of technology meets those standards. I wrote about this in Chapter 2; what NIST came up with (the most recent version, that is—paranoia is always moving forward) is a standard called FIPS 140‐2. The Canadian Communications Security Establishment helps administer the testing program that goes along with FIPS, meaning you have not one but two governments involved—meaning the level of detail and accuracy has got to be pretty high.

For my money, I'd rather forget about figuring out encryption details and just go with whatever works for two of the world's larger governments. If the FIPS 140‐2 cryptography requirements are good enough to protect "Top Secret" information, then they're good enough to protect my patient records, financial information, customer information, or whatever else. So, when it comes to "encryption" on your list of business requirements, just make sure whatever you buy is FIPS 140‐2 compliant and certified. That "certified" bit is critical:

Make the vendor prove that their product has passed the governments' Cryptographic Module Validation Program (CMVP) tests. The vendor should have a NIST certificate number. Other standards to look for include FIPS 197 for the Advanced Encryption Standard (AES) algorithm and FIPS 180 for the SHA family of hash algorithms (with FIPS 198 covering HMAC)—some of the most important and commonly‐used cryptographic algorithms out there today.

If a vendor can prove that they've been validated against these FIPS standards, they should get a top score in your requirements scorecard. If they can't, you're going to have to become an encryption expert to determine whether their encryption is "good enough" to meet your needs. In fact, I typically make FIPS certification a "minimum point" in my criteria, meaning I won't even talk to vendors that don't have a FIPS‐certified product. Looking for FIPS certification makes my shopping easier, as the standard incorporates numerous things that I'd otherwise have to look for and evaluate on my own.
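To make the alphabet soup a little more concrete: the hash and HMAC algorithms covered by FIPS 180 and FIPS 198 are available in most platforms' standard libraries. Here's a minimal Python sketch. Note that merely calling hashlib does not make an application FIPS validated, since validation applies to a specific cryptographic module; treat this purely as an illustration of the algorithms themselves.

```python
import hashlib
import hmac

def sha256_digest(data: bytes) -> str:
    """Compute a SHA-256 digest (a FIPS 180 hash algorithm)."""
    return hashlib.sha256(data).hexdigest()

def hmac_sha256(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA-256 tag (FIPS 198), useful for verifying that
    a received file or message hasn't been altered in transit."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

# A transfer server might record the digest of each received file so that
# tampering can later be detected by recomputing and comparing.
print(sha256_digest(b"abc"))
# -> ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

When comparing tags, use `hmac.compare_digest` rather than `==` to avoid timing side channels.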

If encryption is something you're doing because you have an external requirement—like legislative or industry requirements—then bear something in mind: If the encryption you choose is ever compromised, the enforcing body is going to see that as your responsibility. "Why," they will ask, "did you not select something stronger?" However, if you've selected encryption that meets the government standard, it's hard to argue that you could have done better—and so you'll have covered your responsibilities.

Broad Security Capabilities

Review the details of your other security considerations. For example, will you need an in‐solution user database, the ability to integrate with an external directory, or a solution that can do both? Most file transfer implementations involve transfers to and from external partners, so the ability to create user accounts outside of your internal enterprise directory can be very useful; although it's obviously convenient if your internal users can authenticate using their corporate identity, you don't necessarily want vendors showing up in your Active Directory (AD) or other directory. In some cases, companies might want the file transfer solution to have a completely independent directory and not integrate with other corporate directories, especially if access to the file transfer system will be limited and/or tightly‐controlled.

Other security capabilities include non‐repudiation. Find out exactly how each solution achieves it, and be sure you can describe to each vendor your specific reasons for needing non‐repudiation and the scenarios in which it will apply.


It's rare, in my experience, for managed file transfer solutions to include their own anti‐malware capabilities. In fact, in my opinion, it's usually unnecessary; few of us need yet another anti‐malware product that needs to be continually updated and managed.

Instead, simply make sure that your existing corporate anti‐malware solution will work well with a proposed managed file transfer solution. This can be accomplished through a couple of means: Sometimes, as shown in Figure 4.5, anti‐malware scans occur at the corporate firewall before any data reaches the file transfer server. Firewalls may also perform scans on outgoing files, helping ensure that you're not transmitting viruses or spyware to your business partners. Bear in mind that firewall scans are usually not possible in the case of encrypted transfers simply because the firewall has no way of reading the data—that being the whole point of encryption.

Figure 4.5: Malware scans at the firewall.

In other scenarios, you may simply install a standard anti‐malware client on the file transfer server, letting that client scan files as they arrive on the server's file system—just as they would with any file server. File transfer servers typically keep files on their local file system while the transfer is in progress, and that can give an anti‐malware client an opportunity to scan the file before it's transferred elsewhere. Some file transfer servers may even provide anti‐malware integration points, where they explicitly request a malware scan upon completion of a transfer prior to moving the file or performing other actions. Just find out what kind of support each proposed file transfer solution offers.
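That last kind of integration point is conceptually simple. The sketch below is a hypothetical post-transfer hook (the function name and quarantine layout are my own invention, not any product's API) that runs a scanner over a newly received file before releasing it for further processing:

```python
import shutil
from pathlib import Path
from typing import Callable

def release_after_scan(incoming: Path, released_dir: Path,
                       quarantine_dir: Path,
                       scan: Callable[[Path], bool]) -> Path:
    """Scan a received file; move it to the release area if clean,
    otherwise to quarantine. `scan` returns True when the file is clean;
    in practice it might wrap a call to your corporate scanner's CLI."""
    dest_dir = released_dir if scan(incoming) else quarantine_dir
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / incoming.name
    shutil.move(str(incoming), dest)
    return dest
```

Injecting `scan` as a callable keeps scanner-specific details (say, a `clamscan` subprocess call) out of the transfer logic, so the same hook works with whatever anti-malware product you already own.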

High‐Availability Requirements

If you've identified high availability as a requirement, do some careful research into exactly how each file transfer solution provides high availability. Again, there are no wrong answers, but certain techniques will be more attractive than others based on your company's experience, existing infrastructure, and so forth.

One reason that high availability can be tricky is that managed file transfer solutions need a lot of data to work; in the event of a failure, a secondary or backup server needs access to all of that data. That data includes task automation instructions, encryption certificates and keys, scripts, statistics, and so forth. Any kind of high‐availability system must involve some means for two servers to access that information. A solution built on the Windows Cluster Service, for example, accomplishes this by storing information on a shared external drive that is accessible to both servers. When one server stops functioning, the other can access all the needed information—but this technique has a dependency on Windows Server operating systems (OSs), compatible hardware, and the Windows Cluster Service.

Other solutions may use their own replication technologies to replicate needed information across the network to a backup server, as shown in Figure 4.6. Typically, some information is considered "private" and is not replicated—usually just basic configuration information that's handled during installation of the product. Both servers have the file transfer solution installed, in addition to some kind of replication service—which may be provided as an option by the file transfer solution vendor. Typically, administration is performed only against the "primary" server; that way, configuration changes can be replicated to the secondary by the replication technology.

Figure 4.6: High-availability architecture.
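The "replicate everything except private configuration" idea can be sketched in a few lines. Everything here is illustrative: the `PRIVATE` set stands in for whatever machine-specific items a real product excludes from replication.

```python
import shutil
from pathlib import Path

# Illustrative only: a real product defines its own set of private,
# machine-specific items that must never leave the local server.
PRIVATE = {"local.conf", "machine.key"}

def replicate_state(primary: Path, secondary: Path) -> list[str]:
    """Copy replicable state (task definitions, certificates, scripts,
    statistics) to the standby server, skipping private per-machine
    items. Returns the sorted names of the items copied."""
    secondary.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in primary.iterdir():
        if item.name in PRIVATE:
            continue
        if item.is_dir():
            shutil.copytree(item, secondary / item.name, dirs_exist_ok=True)
        else:
            shutil.copy2(item, secondary / item.name)
        copied.append(item.name)
    return sorted(copied)
```

A real implementation would run continuously and incrementally rather than copying wholesale, but the shape of the problem is the same.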

So what happens when a failover occurs? That depends on the solution, and you should ask vendors that exact question. Here's what's ideal:

  • If the primary server fails, the secondary server should pick up where the primary left off. In‐progress transfers should be resumed, if possible—meaning part of the data replicated from primary to secondary will include the actual files being transferred.
  • If the secondary server fails, the primary (assuming it's up) should continue functioning, and should queue replication updates so that when the secondary returns to service, replication can resume.

The idea is that a single server failure shouldn't impact operations for more than a few minutes, and little if any data or tasks should be lost or permanently interrupted.
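Resuming an in-progress transfer boils down to continuing from the byte offset already present on the receiving side. A simplified local sketch of the mechanic (real solutions negotiate the offset over the wire, for example via FTP's REST command):

```python
from pathlib import Path

CHUNK = 64 * 1024  # read/write in 64 KB chunks

def resume_copy(source: Path, partial_dest: Path) -> int:
    """Continue a transfer into partial_dest from wherever it stopped,
    reading source starting at the offset already received.
    Returns the number of bytes appended."""
    offset = partial_dest.stat().st_size if partial_dest.exists() else 0
    appended = 0
    with source.open("rb") as src, partial_dest.open("ab") as dst:
        src.seek(offset)
        while chunk := src.read(CHUNK):
            dst.write(chunk)
            appended += len(chunk)
    return appended
```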

Workflow Requirements

Aside from security, the need to implement workflows—review/approval cycles and other process‐oriented tasks—is one of the biggest drivers behind managed file transfer adoption. If workflow is part of your business requirements, take some time to understand exactly how much work is involved in creating a customized workflow. Remember: Technology should be driven by business, not the other way around; do not accept a file transfer solution that has fixed, non‐customizable process workflows. If the solution can't be made to model your processes, there's no reason you should change your business to model its processes.

Ease of Customization

How hard will it be to customize the workflow to your needs? This is the single biggest question around "workflow" as a business requirement. The following list highlights the three main techniques I've run across, in decreasing order of complexity:

  • Programming or scripting. This requires the biggest investment, the most specialized skills, and the highest cost of ownership and maintenance. It can also provide the most flexibility because, generally speaking, anything the product can do can be reordered however you require.
  • Drag-and-drop workflow. This is of middling difficulty and flexibility. Typically, the user interface involves something like drawing a Visio diagram: You drag workflow objects, such as "approval" and "review" boxes, around in a flowchart‐style diagram. This setup offers good flexibility but requires a bit more work. Essentially, it's "graphical programming."
  • Dialog-driven workflow. This is typically the easiest to set up and maintain, and depending on how well it's implemented, can offer a remarkable amount of flexibility—some implementations I've seen can do anything you might imagine, in a fairly simple and intuitive interface. The usual technique walks you through a sort of interview or "wizard," where you're asked how you want the workflow to work. Figure 4.7 shows what this might look like.

Figure 4.7: Creating a workflow using a dialog-driven interface.

The thing about "ease of customization" is this: The easier it is, the more people will be able to take on the work, and the less that will have to be dumped on already‐overburdened IT staffers.
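Whichever authoring interface you use, the workflow usually ends up represented internally as ordered, data-driven steps that an engine executes. A toy sketch of that idea, with invented step names and handlers:

```python
from typing import Callable

# Each step is (name, handler); the handler inspects the transfer's
# context and returns True to continue or False to halt the workflow.
Step = tuple[str, Callable[[dict], bool]]

def run_workflow(steps: list[Step], context: dict) -> list[str]:
    """Run steps in order, stopping at the first step that rejects.
    Returns the names of the steps that completed."""
    completed = []
    for name, handler in steps:
        if not handler(context):
            break
        completed.append(name)
    return completed

# Hypothetical review/approval cycle for an outbound transfer:
steps = [
    ("virus-scan", lambda ctx: ctx.get("clean", False)),
    ("manager-approval", lambda ctx: ctx.get("approved", False)),
    ("transfer", lambda ctx: True),
]
print(run_workflow(steps, {"clean": True, "approved": True}))
# -> ['virus-scan', 'manager-approval', 'transfer']
```

A dialog- or drag-and-drop-driven tool is essentially a friendlier way of producing this kind of step list without writing the code yourself.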

The same criteria apply to creating new automated file transfer tasks, such as an automated transfer of data to a business partner on a scheduled basis. This should not involve scripting or programming; a dialog box, or a short series of dialog boxes, should be all that's necessary to set up new tasks. Again, you don't want to have to rely on skilled, busy IT staffers to set up new tasks in every situation. You might choose to have them perform that setup for policy reasons, but the easier it is, the more easily you'll be able to find people to handle it.

Limits on Number of Tasks

Be aware that some file transfer solutions place limits on the number of automated tasks you can create. Often, limits exist in lower‐priced "editions" of a product, with unlimited tasks being permitted in higher‐end editions. This is neither good nor bad; it's simply something you need to pay attention to. If you need relatively few tasks, and don't have any other reason to opt for a more expensive edition, you may be able to save money by using a lower‐priced edition. If you have a large number of tasks to automate, you may need to spend more. Knowing your task workload will help you and the vendor steer toward the right solution.

Always make sure that any limits on the number of tasks can be lifted, either by upgrading to a higher edition or by adding a licensed extension. Ideally, you should be able to perform this "upgrade" simply by entering a license key or installing an extension; you should not have to re‐deploy a whole new product. If you're looking at a solution you love, and the only way it allows you to go from a limited number of tasks to an unlimited number is to re‐deploy, you might want to consider simply opting for the higher‐end edition to begin with.


Tasks aren't the only thing that lesser‐priced editions might limit, so be sure to ask about any other limitations. Remember, limitations can save you money, so don't opt for the high‐end edition simply because it's unlimited. Know your needs, and buy an appropriately‐sized product.

Canned Scripts and Macros

Even when a solution offers a great graphical interface for building automation and workflow, many of those interfaces produce scripts on the back‐end, which is what the solution's engine uses to execute those tasks. When that's the case, having access to a library of pre‐built scripts and macros can often shorten deployment and customization times. A lack of such pre‐built scripts certainly isn't a deal‐breaker, but it's nice to know if they're available and, if they are, what sort of capabilities they can help you achieve more quickly.

Programmability Requirements

As I described in the previous chapter, programmability is typically about letting external software drive the file transfer solution. Managed file transfer solutions have a number of built‐in capabilities: Obviously, transferring files through various protocols is a big one, but you'll also find data‐manipulation capabilities, encryption features, file compression features, and so on. The ability to access these features from within another application can be a powerful way to tightly integrate a file transfer solution into your existing business applications and processes.

Choice of API

If programmability is important to you, make sure you have—or are willing to acquire—the skills needed to work with the solution's Application Programming Interface (API). An API ties you to a specific programming language and environment; some solutions may provide multiple APIs in order to support a broader range of customers. Some examples include:

  • Microsoft .NET Framework
  • Sun Java
  • Microsoft Windows Component Object Model (COM—accessible from C++ and older scripting languages such as VBScript, and often from within .NET Framework applications)
  • Command line (usable within batch files and other system scripts)

It doesn't matter which you use—simply make sure that whatever is offered by the file transfer solution is something you're comfortable with.
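As an example of the command-line style, automation often amounts to a script assembling arguments and shelling out to the product's CLI. Everything product-specific below (the `mft-cli` name and its flags) is hypothetical:

```python
import subprocess

def build_send_args(path: str, host: str, user: str,
                    cli: str = "mft-cli") -> list[str]:
    """Assemble the argument list for a hypothetical 'send file' command."""
    return [cli, "send", "--host", host, "--user", user, path]

def send_file(path: str, host: str, user: str) -> None:
    """Invoke the (hypothetical) CLI; raises CalledProcessError on a
    non-zero exit code, which a batch job can catch and log."""
    subprocess.run(build_send_args(path, host, user), check=True)
```

Separating argument construction from invocation keeps the product-specific flags in one place and makes the wrapper easy to test without a live server.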

Complexity of API

If a solution does include an API, how complex is it? Having your developers review the API documentation is the best way to find out. Simply make sure that your developers (or administrators because command‐line APIs are often scripted by administrators) are comfortable with what they see.

If there's any doubt, speak with the vendor—particularly with a development or product manager. In many cases, vendors have built their APIs based on customer requests, and in some of those cases, vendors are willing to extend their APIs based on further requests. Helping vendors understand how and why you plan to use the API will help them understand how they might need to modify it or allow them to point out alternatives that you may have overlooked or been unaware of.

Protocol Requirements

It should probably go without saying, but let's say it anyway: Make sure the file transfer solution you select supports the protocols you need. Beyond that, however, award extra points (you're still keeping your scorecard updated, right?) for additional protocols above and beyond the ones you need, because you may need additional protocols in the future. Having them built‐in to begin with will save you a lot of time and effort.

Choice of Protocols

The previous chapter provided a list of basic protocols to look for:

  • AS1, AS2, and AS3 secure file transfer
  • Local file system and network file copy (SMB and/or NFS)
  • FTP, including secure variants like FTPS and SFTP/SCP2
  • HTTP, including secured HTTPS
  • SMTP (email‐based transfers, which I'll briefly discuss next)

Simply supporting FTP isn't enough in today's world, mainly because FTP isn't secure by default. And be very cautious of the "secure FTP" variants; as I described in Chapter 1, there have been numerous "secure FTP" attempts. FTPS (which is FTP over SSL) and SFTP (which most commonly means the SSH File Transfer Protocol) are currently the most popular; I've seen variations called "SecureFTP" (which was proprietary to one vendor) and I've seen other nicknames for different variations. Make sure vendors are very clear on what they support, and ask them to name the specific standard protocols, especially with the FTP variations. "SFTP," for example, can mean:

  • The SSH File Transfer Protocol, part of the Secure Shell (SSH) suite of protocols
  • FTP over SSH, a normal FTP session running over a Secure Shell (SSH) connection (which provides the encryption)
  • Secure File Transfer Program, a Secure Shell (SSH) File Transfer Protocol client application
  • Simple File Transfer Protocol, which isn't secure at all (and which is pretty old)
  • Serial File Transfer Protocol, an older protocol used over RS‐232 serial interfaces

None of these are cross‐compatible, of course. Of them all, FTP over SSH is common and desirable, along with FTPS, which is the normal FTP protocol running over a Secure Sockets Layer (SSL) or Transport Layer Security (TLS) connection—basically, FTPS is to FTP as HTTPS is to HTTP. I've also seen FTPS referred to as "Secure FTP." Also common and desirable is another variation of SFTP—the one that means "SSH File Transfer Protocol," which has nothing to do with "classic" FTP and is also often called "Secure FTP."

I know—it's confusing and annoying, which is why you have to spend the time to find out from vendors exactly which bits they support. Why, with all the acronyms in the IT industry, we couldn't come up with ones that involve more than the letters F, T, P, and S in different combinations, I can't tell you.
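The confusion matters in practice. Python's standard library, for instance, speaks FTPS directly, while the SSH-based SFTP requires an SSH library such as the third-party paramiko package; the two are not interchangeable. The sketch below won't run without a real server and credentials, so treat it purely as an illustration of the distinction:

```python
from ftplib import FTP_TLS

def open_ftps(host: str, user: str, password: str) -> FTP_TLS:
    """FTPS: classic FTP commands carried over an SSL/TLS channel."""
    ftps = FTP_TLS(host)   # connects to the FTP control port
    ftps.login(user, password)
    ftps.prot_p()          # switch the data channel to protected mode
    return ftps

# The SSH File Transfer Protocol is a different protocol entirely; with
# paramiko (not in the standard library) an upload looks roughly like:
#   client = paramiko.SSHClient()
#   client.load_system_host_keys()
#   client.connect(host, username=user, password=password)
#   sftp = client.open_sftp()
#   sftp.put("orders.csv", "/inbound/orders.csv")
```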

Email as a Transport Mechanism

I like managed file transfer systems that include support for email (SMTP and POP3, usually) as a transport mechanism. Now, email isn't the most secure thing in the world whether it's encrypted or not, but everyone in the business world has an email address, so sometimes the broad availability and accessibility can override the less‐than‐secure nature of email (which I discussed at length in Chapter 2).

I commonly see email used in person‐to‐person transfers, where one user will use their managed file transfer system to create a secure email to an external user. Some business partners may only be able to deliver data via email, so your file transfer solution has to actually log into a mailbox periodically, check for new files, and then process them—that's where POP3 support comes in. Again, these aren't the optimal ways to move files from place to place, but sometimes you just have to, so I won't usually recommend a file transfer solution that doesn't support them.
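Once a message has been retrieved from the mailbox, the interesting work is pulling file attachments out of the MIME structure. Python's standard email package shows the shape of it; the POP3 retrieval itself (via poplib) is omitted here because it needs a live mailbox:

```python
from email import message_from_bytes

def extract_attachments(raw: bytes) -> dict[str, bytes]:
    """Return {filename: content} for each file attached to a raw
    RFC 822 message, as a transfer solution polling a mailbox might."""
    msg = message_from_bytes(raw)
    files = {}
    for part in msg.walk():
        name = part.get_filename()
        if name:  # only parts that carry a filename are attachments
            files[name] = part.get_payload(decode=True)
    return files
```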

Operational Requirements

Finally, your last set of requirements should focus on your own internal operational requirements. How well can a proposed solution be managed over the long term? Will it support other process and business requirements you may be subject to?

Audit Logging and Reporting

Call it "auditing" or "logging" if you prefer, but most companies—typically as part of a security effort—will want a managed file transfer solution that provides detailed logging. However, you need to be precise about what you want captured. Explain your security requirements—even if that explanation is simply, "we have to be Sarbanes‐Oxley compliant"—and let vendors help you understand how they achieve that.

Some specifics to look for:

  • Every action in the file transfer solution should be logged, or at least be capable of being logged when logging is enabled. That includes configuration changes, task changes, file transfers of any kind, receipt of files, and so on.
  • You might want troubleshooting‐level logging, which means logging connection attempts and other low‐level functional details.
  • Log files—at least the ones related to who did what and when—need to go in some kind of secured, tamperproof or tamper‐evident database. Nobody should be able to clear or alter that log without having to jump through numerous hoops; you don't want someone covering their tracks by dumping the log.
  • Reporting can be incredibly useful and can help make a giant pile of log entries into something more useful. Reports may even be available for specific compliance efforts or for specific security scenarios. If reporting is a feature you need, also consider whether you need the ability to create custom reports.
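Tamper evidence is often implemented by chaining log entries: each record carries a hash of the one before it, so altering or deleting any entry breaks every hash that follows. A minimal in-memory sketch of the idea (a real product would persist this in a protected database):

```python
import hashlib
import json

def append_entry(log: list[dict], actor: str, action: str) -> None:
    """Append an audit record whose hash covers the previous record."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    record = {"actor": actor, "action": action, "prev": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(record)

def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev_hash = "0" * 64
    for rec in log:
        body = {"actor": rec["actor"], "action": rec["action"],
                "prev": prev_hash}
        payload = json.dumps(body, sort_keys=True).encode()
        if rec["prev"] != prev_hash or \
           rec["hash"] != hashlib.sha256(payload).hexdigest():
            return False
        prev_hash = rec["hash"]
    return True
```

Someone trying to cover their tracks would have to rewrite every subsequent record's hash, which is exactly the kind of "jumping through hoops" you want the audit trail to force.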


File transfer solutions that use an external database—such as Microsoft SQL Server or Oracle—offer an advantage in that the database can be accessed by any kind of reporting tool you might have. You'll obviously have to do more work to create custom reports in this fashion, but if you have extremely detailed reporting requirements, it's good to have that flexibility.
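For instance, with transfer history in an ordinary relational table, a custom report is just a query. The sketch below uses SQLite and an invented schema to stand in for whatever tables your product actually writes:

```python
import sqlite3

def failures_by_partner(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Report failed transfers per partner, most failures first."""
    return conn.execute(
        """SELECT partner, COUNT(*) AS failures
           FROM transfers
           WHERE status = 'failed'
           GROUP BY partner
           ORDER BY failures DESC, partner"""
    ).fetchall()

# Build a tiny in-memory stand-in for the solution's logging database:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfers (partner TEXT, status TEXT)")
conn.executemany("INSERT INTO transfers VALUES (?, ?)", [
    ("acme", "ok"), ("acme", "failed"),
    ("globex", "failed"), ("globex", "failed"),
])
print(failures_by_partner(conn))  # -> [('globex', 2), ('acme', 1)]
```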


Monitoring and Maintenance

How will you monitor and maintain the file transfer solution? Some solutions may offer their own management console that includes monitoring capabilities, and in some environments, that may be all you need. Other environments may prefer to integrate the file transfer solution with an existing enterprise monitoring framework, such as IBM Tivoli, HP OpenView, Microsoft System Center Operations Manager, and so on. File transfer solutions may integrate directly with some frameworks or may integrate more generically through a common protocol like the Simple Network Management Protocol (SNMP).

Don't think that there's no need to monitor a file transfer solution; there is. You need to be alerted to problems, proactively alerted to pending problems (like low disk space), and so on.

Other Considerations

If you haven't already done so, start creating a list of usage scenarios for a managed file transfer solution. Describe, in narrative form if possible, some of the business situations where you would see file transfer coming into use. You might find that these situations— when you really start to think about the details—will help crystallize additional requirements, and you can add them to your list as you evaluate products. For example:

We have hundreds of subcontractors who need to transfer files to us. These files consist of in‐progress and completed work, and need to be transferred securely. We need to ensure that the transfer is encrypted, and that only authorized subcontractors can log in. Our subcontractors do not (and cannot) have Active Directory accounts in our domain.

This scenario reveals business requirements that you've no doubt already thought of: encryption, the ability to maintain an independent user database, and so on. But really think about how this might work in the real world. Will subcontractors have to download a special piece of client software in order to transfer files to you? Or will they use a Web interface? If they're using client software, what OSs do you need to provide a client for? As you conduct your evaluation, you don't need to have answers to these questions, but you should be asking these questions of the vendors you work with. How would their product support this scenario? In the end, you may be happy with several widely divergent answers from different vendors, which is fine—but the important thing is that you'll know how each of them supports this important business scenario.


This example really stresses the need to talk to solution vendors. A lot of IT people would prefer to just find the information they need in a search engine or on a vendor's Web site—and you can certainly get a lot of information that way. But when you start diving into subtle, situation‐specific details, there's nothing like a phone call with someone who knows their product. I once had a scenario similar to the previous example and couldn't find anything about it on the vendor's Web site. Turns out they had an entirely separate product that supported this exact scenario and I just hadn't realized it. A day or so of investigation could have been replaced by a 10‐minute phone call.

Thanks for Reading

Well, that's it. Hopefully, I've helped you understand why secure, managed file transfer is such a great thing to have in your environment—how it can help offer you the security, the automation, and even the cost savings that you need. We've busted some myths about file transfer, and I've helped you figure out exactly which file transfer features are most important to you. In this chapter, I walked you through some of the things you'll need to consider when evaluating solutions, and I hope that you'll be able to really focus on some of the fine details and subtle differences when you get into your evaluation.

Corporate file transfer has come a long way from your basic FTP server; today's managed file transfer solutions offer better manageability, true workflow, powerful automation, and more. They offer a better cost of ownership than homegrown solutions, while bringing a set of capabilities that most companies really need today. Good luck with your managed file transfer projects.