Ball in your court

 


Musings on Electronic Discovery
"Ball in Your Court"
April 2005 - January 2010
© Craig Ball

The Law Technology News column "Ball in Your Court" is both the 2007 and 2008 Gold Medal honoree as "Best Regular Column" as awarded by Trade Association Business Publications International. It's also the 2009 Gold and the 2007 Silver Medalist honoree of the American Society of Business Publication Editors as "Best Contributed Column" and their 2006 Silver Medalist honoree as "Best Feature Series" and "Best Contributed Column."

Contents

About the Author
The DNA of Data
Unclear on the Concept
Cowboys and Cannibals
Give Away Your Computer
Don't Try This at Home
Yours, Mine and Ouch
The Path to E-Mail Production
The Path to Production: Retention Policies That Work
The Path to Production: Harvest and Population
The Path to Production: Are We There Yet
Locard's Principle
A Golden Rule for E-Discovery
Data Recovery: Lessons from Katrina
Do-It-Yourself Digital Discovery
Function Follows Form
Rules of Thumb for Forms of ESI Production
Ten Common E-Discovery Blunders
Ten Tips to Clip the Cost of E-Discovery
Copy That
In Praise of Hash
santa@northpole.com
Unlocking Keywords
Climb the Ladder
Vista Changes the View
Getting to the Drive
Who Let the Dogs Out
Do-It-Yourself Forensics
Do-It-Yourself Forensic Preservation, Part II
Page Equivalency and Other Fables
Re-Burn of the Native
The Power of Visuals
Well Begun Is Half Done
Ask the Right Questions
Crystal Ball in Your Court
Redaction Redux
Trying to Love XML
The Science of Search
Dealing with Third-Parties
Tumble to Acrobat as an E-Discovery Tool
Grimm Prognosis
Brain Drain
SNAFU
Car 54, Where Are You
Problematic Protocols
Right from the Start
Crystal Ball
What Lies Beneath
The Multipass Erasure Myth
Don't Touch That
Special Masters
Surefire Steps to Splendid Search, Part 1
Surefire Steps to Splendid Search, Part 2
All Wet
Tell Ol' Yahoo, Let My E-Mail Go
Jolly Roger Justice
The ESIs of Texas
Geek's Gift Guide
E-Discovery Bill of Rights

About the Author

Craig Ball of Austin is a Board Certified Texas trial lawyer and an accredited computer forensics expert who's dedicated his career to teaching the bench and bar about forensic technology and trial tactics. Craig hung up his trial lawyer spurs to till the soils of justice as a court-appointed special master and consultant in electronic evidence, as well as publishing and lecturing on computer forensics, emerging technologies, digital persuasion and electronic discovery. Fortunate to supervise, consult on or serve as Special Master in connection with some of the world's largest electronic discovery projects and most prominent cases, Craig also greatly values his role as an instructor in computer forensics and electronic evidence to the Department of Justice and other law enforcement and security agencies.

Craig Ball is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 600 presentations and papers. Craig's articles on forensic technology and electronic discovery frequently appear in the national media, including in American Bar Association, ATLA and American Lawyer Media print and online publications. He also writes a multi-award winning monthly column on computer forensics and e-discovery for Law Technology News and Law.com called "Ball in Your Court." Rated AV by Martindale Hubbell and named as one of the Best Lawyers in America and a Texas Superlawyer, Craig is a recipient of the Presidents' Award, the State Bar of Texas' most esteemed recognition of service to the profession.

Craig's been married to a trial lawyer for 22 years. He and Diana have two delightful teenagers and share a passion for world travel, cruising and computing.

Undergraduate Education: Rice University, triple major, 1979
Law School: University of Texas, 1982, with honors


The DNA of Data
by Craig Ball
[Originally published in Law Technology News, April 2005]

Discovery of electronic data compilations has been part of American litigation for two generations, during which time we've seen nearly all forms of information migrate to the digital realm. Statisticians posit that only five to seven percent of all information is born outside of a computer, and very little of the digitized information ever finds its way to paper. Yet, despite the central role of electronic information in our lives, electronic data discovery (EDD) efforts are either overlooked altogether or pursued in such epic proportions that discovery dethrones the merits as the focal point of the case.

At each extreme, lawyers must bear some responsibility for the failure. Few of us have devoted sufficient effort to learning the technology, instead deluding ourselves that we can serve our clients by continuing to focus on the smallest, stalest fraction of the evidence: paper documents. When we do garner a little knowledge, we abuse it like the Sorcerer's Apprentice by demanding production of "any and all" electronic data and insisting on preservation efforts sustainable only through operational paralysis. We didn't know how good we had it when discovery meant only paper.

However, electronic evidence isn't going away. It's growing exponentially, and some electronic evidence items, like databases, spreadsheets, voice mail and video, bear increasingly less resemblance to paper documents. Proposed changes in the rules of procedure wending their way through the system require lawyers to discuss ways to preserve electronic evidence, select formats in which to produce it and manage volumes of information dwarfing the Library of Congress. Litigators must learn it or find a new line of work.

My goal for this column is to help make electronic discovery and computer forensics a little easier to understand, never forgetting that this is exciting, challenging and very cool stuff.

Accessible versus Inaccessible

You can't talk about EDD today without using the "Z" word: Zubulake (pronounced "zoo-boo-lake"). Judge Shira Scheindlin's opinions in Zubulake v. UBS Warburg L.L.C., 217 F.R.D. 309 (S.D.N.Y. 2003), triggered a whirlwind of discussion about EDD. Judge Scheindlin cited the accessibility of data as the threshold for determining issues of what must be produced and who must bear the cost of production. Accessible data must be preserved, processed and produced at the producing party's cost, while inaccessible data is available for good cause and may trigger cost shifting.

But what makes data inaccessible? Is it a function of the effort and cost required to make sense of the data? If so, do the boundaries shift with the skill and resources of the producing party, such that ignorance is rewarded and knowledge penalized? To understand when data is truly inaccessible requires a brief look at the DNA of data.


Everything's Accessible

Computer data is simply a sequence of ones and zeroes. Data is only truly inaccessible when you can't read the ones and zeroes or figure out where the sequence starts. To better grasp this, imagine you had the unenviable responsibility of typing the complete works of Shakespeare on a machine with only two keys, A and B, and if you fail, all the great works of the Bard would be lost forever. As you ponder this seemingly impossible task, you'd figure out that you could encode the alphabet using sequences of As and Bs to represent each of the twenty-six capital letters, their lower case counterparts, punctuation and spaces. The uppercase W might be ABABABBB and the uppercase S, ABABAABB. Cumbersome, but feasible. Armed with the code and knowing where the sequence begins, a reader can painstakingly reconstruct every lovely foot of iambic pentameter. This is just what a computer does when it stores data in ones and zeroes, except computers encode many alphabets and work with sequences billions of characters long.

Computer data is only gone when the media that stores it is obliterated, overwritten or strongly encrypted without a key. This is true for all digital media, including back up tapes and hard drives. But inaccessibility due to damage, overwriting or encryption is rarely raised as grounds for limiting e-discovery or shifting costs.

Just Another Word for Burdensome

Frequently, lawyers will couch a claim of undue burden in terms of inaccessibility, arguing that it's too time-consuming or costly to restore the data. But burden and inaccessibility are opposite sides of the same coin, and inaccessibility adds nothing to the mix but confusion. Arguing both burden and inaccessibility is two bites at the apple.

Worse, there is a risk in branding particular media as inaccessible. Parties resisting discovery shouldn't be relieved of the obligation to demonstrate undue burden simply because evidence resides on a back up tape. We must be vigilant to avoid a reflexive calculus like:

All back up tapes are inaccessible.
Inaccessible means undue burden presumed.
Good cause showing required for production.
Requesting party pays cost of conversion to accessible form.

Zubulake put EDD on every litigator's and corporate counsel's radar screen and proved invaluable as a provocateur of long-overdue debate about electronic discovery. Still, its accessibility analysis is not a helpful touchstone, especially in a fast-moving field like computing. Codifying it in proposed amendments to F.R.C.P. Rule 26(b)(2) would perpetuate a flawed standard. Even if that occurs, don't be cowed by the label "inaccessible," and don't shy away from seeking discovery of relevant media just because it's cited as an example of something inaccessible. Instead, require the producing party to either show that the ones and zeroes can't be accessed or demonstrate that production entails an undue burden.
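The two-key typewriter in this column can be tried out directly. What follows is a minimal sketch, not anything from the original column: it assumes the code is ordinary 8-bit ASCII with A standing for 0 and B for 1, an assumption that happens to reproduce the column's sample codes for W and S. The function names `encode` and `decode` are made up for illustration.

```python
def encode(text: str) -> str:
    """Type ASCII text on a two-key machine: eight A/B keystrokes per character."""
    keys = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"not an ASCII character: {ch!r}")
        bits = format(code, "08b")          # e.g. 'W' -> '01010111'
        keys.append(bits.replace("0", "A").replace("1", "B"))
    return "".join(keys)


def decode(keys: str) -> str:
    """Rebuild the text, assuming we know the code and where the sequence starts."""
    if len(keys) % 8:
        raise ValueError("sequence length must be a multiple of 8")
    chars = []
    for i in range(0, len(keys), 8):
        bits = keys[i:i + 8].replace("A", "0").replace("B", "1")
        chars.append(chr(int(bits, 2)))
    return "".join(chars)


if __name__ == "__main__":
    print(encode("W"))   # ABABABBB
    print(encode("S"))   # ABABAABB
    print(decode(encode("To be, or not to be")))
```

Dropping even one keystroke from the front of an encoded sequence before decoding shifts every eight-key group and yields gibberish, which is the sense in which a reader must know "where the sequence starts."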


Unclear on the Concept
by Craig Ball
[Originally published in Law Technology News, May 2005]

A colleague buttonholed me at the American Bar Association's recent TechShow and asked if I'd visit with a company selling concept search software to electronic discovery vendors. Concept searching allows electronic documents to be found based on the ideas they contain instead of particular words. A concept search for "exploding gas tank" should also flag documents that address fuel-fed fires, defective filler tubes and the Ford Pinto. An effective concept search engine "learns" from the data it analyzes and applies its own language intelligence, allowing it to, e.g., recognize misspelled words and explore synonymous keywords.

I said, "Sure," and was delivered into the hands of an earnest salesperson who explained that she was having trouble persuading courts and litigators that the company's concept search engine worked. How could they reach them and establish credibility? She extolled the virtues of their better mousetrap, including its ability to catch common errors, like typing "manger" when you mean "manager."

But when we tested the product against its own 100,000 document demo dataset, it didn't catch misspelled terms or search for synonyms. It couldn't tell "manger" from "manager." Phrases were hopeless. Worse, it didn't reveal its befuddlement. The program neither solicited clarification of the query nor offered any feedback revealing that it was clueless on the concept.

The chagrined company rep turned to her boss, who offered, "100,000 documents are not enough for it to really learn. The program only knows a word is misspelled when it sees it spelled both ways in the data it's examining and makes the connection."

The power of knowledge lies in using what's known to make sense of the unknown. If the software only learns what each dataset teaches it, it brings nothing to the party. Absent from the application was a basic lexicon of English usage, nothing as fundamental as Webster's Dictionary or Roget's Thesaurus. There was no vetting for common errors, no fuzzy searching or any reference foundation. The application was the digital equivalent of an idiot savant, and I'm taking the savant on faith because this application is the plumbing behind some major vendors' products.

Taking the Fifth

In the Enron/Andersen litigation, I was fortunate to play a minor role for lead plaintiff's counsel as an expert monitoring the defendant's harvesting and preservation of electronic evidence. The digital evidence alone quickly topped 200 terabytes, far more information than if you digitized all the books in the Library of Congress. Printed out, the paper would reach from sea-to-shining sea several times. These gargantuan volumes, and increasingly those seen in routine matters, can't be examined without automated tools. There just aren't enough associates, contract lawyers and paralegals in the world to mount a manual review, nor the money to pay for it. Of necessity, lawyers are turning to software to divine relevancy and privilege.

But as the need for automated e-discovery tools grows, the risks in using them mount. It's been 20 years since the only study I've seen pitting human reviewers against search tools. Looking at a (paltry by current standards) 350,000 page litigation database, the computerized searches turned up just 20 percent of the relevant documents found by the flesh-and-bone reviewers. The needle-finding tools have improved, but the haystacks are much, much larger now. Are automated search tools performing well enough for us to use them as primary evidence harvesting tools?

Metrics for a Daubert World

Ask an e-discovery vendor about performance metrics and you're likely to draw either a blank look or trigger a tap dance that would make the late Ann Miller proud. How many e-discovery products have come to market without any objective testing demonstrating their efficacy? Where is the empirical data about how concept searching stacks up against human reviewers? How has each retrieval system performed against the National Institute of Standards and Technology text retrieval test collections?

If the vendor response is, "We've never tested our products against real people or government benchmarks," how are users going to persuade a judge it was a sound approach come the sanctions hearing?

We need to apply the same Daubert-style standards [Daubert v. Merrell Dow Pharmaceuticals (92-102), 509 U.S. 579 (1993)] to these systems that we would bring to bear against any other vector for junk science: Has it been rigorously tested? Peer-reviewed? What are the established error rates?

Calibration and Feedback

Like the airport security staff periodically passing contraband through the x-ray machines and metal detectors to check the personnel and equipment, automated search systems must be periodically tested against an evolving sample of evidence scrutinized by human intelligence. Without this ongoing calibration, the requesting party may persuade the court that your net's so full of holes, only a manual search will suffice. If that happens, what can you do but settle?

Thanks to two excellent teachers, I read Solzhenitsyn in seventh grade and Joyce Carol Oates in the ninth. I imagine that if I re-read those authors today, I'd get more from them than my adolescent sensibilities allowed. Likewise, if software gets smarter as it looks at greater and greater volumes of information, is there a mechanism to revisit data processed before the software acquired its "wisdom," lest it derive no more than my 11-year-old brain gleaned from One Day in the Life of Ivan Denisovitch? What is the feedback loop that ensures the connections forged by progress through the dataset apply to the entire dataset?
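One way to picture the calibration described above is to score a tool's hits against a sample that human reviewers have already judged. The sketch below is illustrative only: the function name `calibrate`, the document identifiers and the sample data are all invented, and the column does not prescribe any particular metric. Recall here means the share of human-identified relevant documents the tool also found; precision is the share of the tool's hits that the humans judged relevant.

```python
def calibrate(human_relevant: set, tool_hits: set) -> dict:
    """Compare a tool's hits with documents human reviewers judged relevant."""
    found = human_relevant & tool_hits
    recall = len(found) / len(human_relevant) if human_relevant else 0.0
    precision = len(found) / len(tool_hits) if tool_hits else 0.0
    return {
        "recall": recall,        # share of relevant documents the tool found
        "precision": precision,  # share of tool hits that were relevant
        "missed": sorted(human_relevant - tool_hits),
    }


if __name__ == "__main__":
    # Hypothetical sample: five documents the reviewers marked relevant,
    # two documents returned by the search tool.
    reviewers = {"doc01", "doc02", "doc03", "doc04", "doc05"}
    tool = {"doc01", "doc09"}
    report = calibrate(reviewers, tool)
    print(report["recall"], report["precision"], report["missed"])
```

In this made-up sample the tool finds one of five relevant documents, the same 20 percent figure the study mentioned earlier reported. The list of missed documents is the useful part for ongoing calibration, since it shows reviewers what kind of material is slipping through the net.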


For example, in litigation about a failed software development project, the project team got into the habit of referring to the project amongst themselves as "the abyss" and "the tar baby." Searches for the insider lingo, as concepts or keywords, are likely to turn up e-mails confirming that the project team knowingly poured client monies into a dead end. If the software doesn't make this connection until it processes the third wave of data, what about what it missed in waves one and two? Clearly, the way the data is harvested and staged impacts what is located and produced.

Of course, this epiphany risk (not realizing what you saw until after you've reviewed a lot of stuff) afflicts human examiners too, along with fatigue, inattentiveness and sloth, to which machines are immune. But we trust that a diligent human examiner will sense when a newly forged connection should prompt re-examination of material previously reviewed. Will the software know to ask, "Hey, will you re-attach those hard drives you showed me yesterday? I've figured something out."

Concept Search Tools

Though judges and requesting parties must be wary of concept search tools absent proof of their reliability, even flawed search tools have their place in the trial lawyer's toolbox. Concept searching helps overcome limitations of optical character recognition, where seeking a match to particular text may be frustrated by OCR's inability to read some fonts and formats. It also works as a lens through which to view the evidence in unfamiliar ways, see relationships that escaped notice and better understand your client's data universe while framing filtering strategies.

I admire the way EDD-savvy Laura Kibbe, in-house counsel for pharmaceutical giant Pfizer Inc., uses concept searching. She understands the peril of using it to filter data and won't risk having to explain to the court how concept searching works and why it might overlook discoverable documents. Instead, Laura uses concept searching to brainstorm keywords for traditional word searches and then uses it again as a way to prioritize her review of harvested information.

For producing parties inclined to risk use of concept searching as a filtering tool, inviting the requesting party to contribute keywords and concepts for searching is an effective strategy to forestall finger pointing about non-production. The overwhelming volume and the limitations of the tools compel transformation of electronic discovery to a collaborative process. Working together, both sides can move the spotlight away from the process and back onto the merits of the case.
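The "manger" versus "manager" problem raised earlier in this column does not require a tool to learn anything from the dataset. As a rough illustration, the sketch below uses plain string similarity from Python's standard `difflib` module to widen a keyword to near-spellings found in the documents. This is ordinary fuzzy matching, not concept searching, and not the method of any vendor discussed here; the function names, the sample documents and the 0.8 similarity cutoff are all assumptions chosen for the example.

```python
import difflib
import re


def words_in(text: str) -> list:
    """Split text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def near_spellings(term: str, vocabulary: list, cutoff: float = 0.8) -> list:
    """Return vocabulary words whose spelling is close to the search term."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=10, cutoff=cutoff)


def fuzzy_search(docs: dict, term: str, cutoff: float = 0.8) -> list:
    """Return ids of documents containing the term or a near-spelling of it."""
    vocabulary = sorted({w for text in docs.values() for w in words_in(text)})
    variants = set(near_spellings(term, vocabulary, cutoff)) | {term.lower()}
    return sorted(doc_id for doc_id, text in docs.items()
                  if variants & set(words_in(text)))


if __name__ == "__main__":
    # Hypothetical three-document collection.
    docs = {
        "a": "The manger approved the budget.",
        "b": "Our manager signed off on the plan.",
        "c": "Quarterly memo attached.",
    }
    print(fuzzy_search(docs, "manager"))
```

A search for "manager" on these sample documents returns both the correctly spelled document and the one containing the typo. A looser cutoff catches more misspellings at the price of more false hits, which is exactly the kind of error-rate tradeoff that ought to be measured rather than assumed.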


Cowboys and Cannibals
by Craig Ball
[Originally published in Law Technology News, June 2005]

With its quick-draw replies, flame wars, porn and spam, e-mail is the Wild West boom town on the frontier of electronic discovery: all barroom brawls, shoot-outs, bawdy houses and snake oil salesmen. It's a lawless, anyone-can-strike-it-rich sort of place, but it's taking more-and-more digging and panning to get to the gold. Folks, we need a new sheriff in town.

A Modest Proposal

E-mail distills most of the ills of e-discovery, among them massive unstructured volume, mixing of personal and business usage, wide-ranging attachment formats and commingled privileged and proprietary content. E-mail epitomizes "everywhere" evidence. It's on the desktop hard drive, the server, back up tapes, home computer, laptop on the road, Internet service provider, cell phone and personal digital assistant. Stampede! There's more to electronic data discovery than e-mail, but were we to figure out how to simply and cost-effectively round up, review and produce all that maverick e-mail, wouldn't we lick EDD's biggest problem?

The e-mail sheriff I envision is a box that pops up when you hit send and requires designation of the e-mail as personal or business-related. If personal, it's sent, and a copy is immediately forwarded to your personal e-mail account. The personal message is then purged from the enterprise system. If business related, you must assign the message to its proper place within the organization's data structure. If you don't put it where it belongs, the system won't send it. Tough love for a wired world! On the receiving end, when you seek to close an e-mail you've read, you're likewise prompted to file it within your organization's data structure, deciding if it's personal or business and where it belongs.

When I first broached this idea to my e-discovery colleagues, the response was uniformly dismissive, "Our people wouldn't do it" being the common reply. Hogwash! They'll do it if they have to do it. They'll do it if there's a carrot and a stick. They'll do it if the management system is designed well and implemented aggressively. I ask them, "Why do you make employees punch in a code to use the photocopier but require no accountability for e-mail that may sink the company?"

Some claim, "Our people will just call everything personal or file all business correspondence as 'Office General.'" Possibly, but that means that business data will be notable by its absence from its proper place. Eventually, the boss will say, "Dammit, Dusty, why can't you keep up with your e-filing?" In addition, Dusty won't want the system to report that he characterizes 95 percent of the at-work electronic communications he handles each day as personal in nature. Certainly, there needs to be audit and oversight, and the harder you make it for a user to punt or evade the system, the better the outcome.

This model worked for paper. It can work for e-mail. Once, a discovery request sent a file clerk scurrying to a file room set aside for orderly information storage. There, the clerk sought a labeled drawer or box and the labeled folders within. He didn't search every drawer, box or folder, but went only to the places where the company kept items responsive to the request. From cradle to grave, paper had its place, tracked by standardized, compulsory practices. Correspondence was dated and its contents or relevance described just below the date. Files bore labels and were sorted and aggregated within a structure that generally made sense to all who accessed them. These practices enabled a responding party to affirm that discovery was complete on the strength of the fact that they'd looked in all the places where responsive items were kept.

By contrast, the subject lines of e-mails may bear no relation to the contents or be omitted altogether. There is no taxonomy for data. Folder structures are absent, ignored or unique to each user. Most users' e-mail management is tantamount to dumping all their business, personal and junk correspondence into a wagon, hoping the Google cavalry will ride to the rescue. The notion "keep everything and technology will help you find it" is as seductive as a dance hall floozy, and just as treacherous.

E-discovery is not more difficult and costly than paper discovery simply because of the sheer volume of data or even the variety of formats and repositories. Those concerns are secondary to the burdens occasioned by the lack of electronic records management. We could cope with the volume if it were structured, because we could rely on that structure to limit our examination to manageable chunks.

Satirist Jonathan Swift was deadly humorous when, in his 1729 essay "A Modest Proposal," he suggested the Irish eat their children to solve a host of societal ills. But I'm deadly serious when I modestly propose we swallow our reluctance and impose order on enterprise e-mail. The payback is genuine and immediate. Tame the e-mail bronco and the rest of the herd will fall in line.

Does imposing structure on electronic information erase the advantages of information technology? Is it horse-and-buggy thinking in a jet age? No, but it has its costs. One is speed. If the sender or recipient of an e-mail is obliged to think about where any communication fits within their information hierarchy and designate a location, that means the user has to pause, think and act. They can't just expectorate a message and hit send. Dare we re-introduce deliberation to communication? The gun-slinging plaintiff's lawyer in me will miss the unvarnished res gestae character of unstructured e-mail, but in the end, we can do with a little law west of the Pecos.
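For readers who think in code, the send-time rule of the proposed "e-mail sheriff" can be sketched in a few lines. Everything below is hypothetical: the `Message` fields, the folder names and the `gate_send` function are invented to illustrate the rule as the column states it (personal mail is forwarded and purged, business mail must be filed before it goes out), and nothing here reflects an actual mail system's API.

```python
from dataclasses import dataclass
from typing import Optional


class FilingRequired(Exception):
    """Raised when a message may not be sent until the user classifies it."""


@dataclass
class Message:
    sender: str
    recipient: str
    subject: str
    category: Optional[str] = None   # "personal" or "business"
    folder: Optional[str] = None     # place in the organization's data structure


# Hypothetical filing structure; a real one would come from records management.
VALID_FOLDERS = {"Clients/Acme/Contracts", "HR/Benefits", "Finance/Invoices"}


def gate_send(msg: Message, valid_folders=VALID_FOLDERS) -> str:
    """Decide what happens when the user hits send."""
    if msg.category == "personal":
        # Sent, copied to the user's personal account, purged from the enterprise.
        return "send; forward copy to personal account; purge from enterprise"
    if msg.category == "business":
        if msg.folder not in valid_folders:
            raise FilingRequired("assign the message to its proper folder")
        return f"send; file under {msg.folder}"
    raise FilingRequired("designate the message as personal or business")


if __name__ == "__main__":
    ok = Message("dusty@example.com", "client@example.com", "Draft contract",
                 category="business", folder="Clients/Acme/Contracts")
    print(gate_send(ok))
```

The audit the column calls for would sit on top of this: counting, per user, how many messages were tagged personal or dumped in a catch-all folder.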


Give Away Your Computer
by Craig Ball
[Originally published in Law Technology News, July 2005]

With the price of powerful computer systems at historic lows, who isn't tempted to upgrade? But what do you do with a system you've been using if it's less than four or five-years old and still has some life left in it? Pass it on to a friend or family member, or donate it to a school or civic organization, and you're ethically obliged to safeguard client data on the hard drive. Plus, you'll want to protect your personal data from identity thieves and snoopers. Hopefully, you already know that deleting confidential files and even formatting the drive does little to erase your private information. It's like tearing out the table of contents but leaving the rest of the book. How do you be a Good Samaritan without jeopardizing client confidences and personal privacy?

Options

One answer: replace the hard drive with a new one before you donate the old machine. Hard drives have never been cheaper, and adding the old hard drive as extra storage in your new machine ensures easy access to your legacy data. But it also means going out-of-pocket and some surgery inside both machines, not everyone's cup of tea.

Alternatively, you could remove or destroy the old hard drive, but those accepting older computers rarely have the budget to buy hard drives, let alone the technician time to get donated machines running. Donated systems need to be largely complete and ready to roll.

Probably the best compromise is to wipe the hard drive completely and donate the system recovery disk along with the system. Notwithstanding some largely theoretical notions, once you overwrite every sector of your hard drive with zeros or random characters, your data is gone forever. The Department of Defense recommends several passes of different characters, but just a single pass of zeros is enough to frustrate all computer forensic data recovery techniques in common use.

Free is Good

You can buy programs to overwrite your hard drive, but why do so? Effective erasure tools are available as free downloads from the major hard drive manufacturers, and most work on other manufacturers' drives. Western Digital offers its Data Lifeguard Diagnostic Tool at http://support.wdc.com/download. Seagate's DiscWizard Starter Edition is found at www.seagate.com/support/disc/drivers/discwiz.html, and Maxtor's PowerMax Utilities is found by drilling down from www.maxtor.com/support. DBAN (for Darik's Boot and Nuke), a free Linux program, will also obliterate all data on a Windows system and is available at http://dban.sourceforge.net. Each application offers bells-and-whistles, but all you're seeking is the ability to create a boot floppy that can write zeroes to the hard drive. If your system has no floppy drive, each site also offers a boot CD image download.


Why a boot floppy or CD? Because no wiping program running under Windows can erase all of the data on a Windows drive. Running under DOS or, in the case of DBAN, Linux ensures that no file is locked out to the wiping utility while it does its job. To this end, check to be sure that whatever wiping application you select "sees" the entire hard drive. If it only recognizes, say, the first 32 GB of a 40 GB drive, check your settings or use a different utility. Fortunately, these utilities are user-friendly and report what they see and do.

Careful

Wiping every sector on a hard drive is a time consuming process. Allow hours of largely unattended operation to get the job done, and if it's an option, be sure to select a full overwrite or low level format and not a quick version. There are no shortcuts to overwriting every sector to sterilize a drive.

Check to be sure there is only one hard drive in the system. If multiple drives are present, wipe each of them.

Above all, understand that there is no turning back from this kind of data erasure: no Recycle Bins, no Undo command, no clean room magic. Be absolutely certain you have another working copy of anything you mean to keep.

An Important Courtesy

When you sterilize a drive, your privileged data is obliterated along with the operating system and all applications. A wiped drive can't boot a computer but can return to service if you remember to donate the system restore disk with the hardware. For computers lacking restore disks, supply the operating system installation disk and any application disks you wish to donate. As long as you're not continuing to use the same applications loaded from the same disks (or copies) on your new machine, your end user license is likely to be freely transferable. If the donated system came without disks, you or your recipient will need to contact the manufacturer and request a restore disk.

If, as is often the case in larger firms, the operating systems are site licensed, it may be a violation of that license to share them. Your recipient will then need to purchase their own license or seek out someone who'll donate an operating system. School districts typically have their own site licenses.

Dodging Blasts from the Past

Be sure to caution your recipient that it's very important to promptly download critical security patches and service packs for the restored operating system and applications. A restored machine is like a step back in time to when many now-closed security holes were wide open, so the recipient needs to slam these vulnerabilities shut at the very first connection to the Internet.

Help for the Helper

Worries about data security needn't keep you from helping others by donating your used computer. For additional guidance, contact TechSoup (www.techsoup.org) or the National Cristina Foundation (www.cristina.org) and seek out or organize the computer donation program in your community.
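To make the "single pass of zeros" idea concrete, here is a minimal sketch that overwrites every byte of an ordinary file with zeros and then checks the result. It is a stand-in only: the file plays the role of a drive, the function names are invented, and running something like this from inside a live operating system is not a substitute for the boot-disk utilities discussed above, since the operating system and the storage hardware decide where on the media the bytes actually land.

```python
import os

CHUNK = 1024 * 1024  # overwrite one megabyte at a time


def zero_fill(path: str, chunk_size: int = CHUNK) -> int:
    """Overwrite the full length of a file with zero bytes; return bytes written."""
    size = os.path.getsize(path)
    zeros = bytes(chunk_size)            # a block of zero bytes
    written = 0
    with open(path, "r+b") as f:         # open in place, without truncating
        while written < size:
            n = min(chunk_size, size - written)
            f.write(zeros[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())             # ask the OS to push the writes to disk
    return written


def is_all_zero(path: str, chunk_size: int = CHUNK) -> bool:
    """Verify that every byte in the file is zero."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return True
            if chunk.count(0) != len(chunk):
                return False
```

The verification step mirrors the column's advice to confirm that the wiping tool "sees" the whole drive: a pass that stops short leaves readable data behind, and only a check of the full length will reveal it.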


Breaking News

Clearing your donated, sold or discarded hard drives of sensitive information isn't just good practice; it's now also required by law. Effective June 1, 2005, the Federal Trade Commission's Disposal Rule, 16 CFR Part 682, requires businesses, including lawyers and law firms, to take reasonable measures to dispose of sensitive information derived from credit reports and background checks so that the information cannot practicably be read or reconstructed. The Rule, which applies to both paper and digital media, requires implementing and monitoring compliance with disposal policies and procedures for this information. Comments to the Rule suggest using disc wiping utilities but also offer that electronic media may be economically disposed of by "simply smashing the material with a hammer." Sounds like a great stress reliever, but don't forget your safety goggles!


Don't Try This at Home
by Craig Ball
[Originally published in Law Technology News, August 2005]

The legal assistant on the phone asked, "Can you send us copies of their hard drives?"

As court-appointed Special Master, I'd imaged the contents of the defendant's computers and served as custodian of the data for several months. The plaintiff's lawyer had been wise to lock down the data before it disappeared, but like the dog that caught the car, he didn't know what to do next. Now, with trial a month away, it was time to start looking at the evidence.

"Not unless the judge orders me to give them to you," I replied.

The court had me act as custodian because the discoverable evidence on a hard drive lives cheek by jowl with all manner of sensitive stuff, such as attorney-client communications, financial records and pictures of naked folks engaged in recreational activity. In suits between competitors, intellectual property and trade secrets such as pricing and customer contact lists need protection from disclosure when not evidence, as does all that full-of-surprises deleted data accessible by forensic examination.

"Even if the court directs me to turn over the drive images, you probably won't be able to access the data without expert assistance."

I explained that, like most computer forensic specialists, I store the contents of hard drives as a series of compressed image files, not as bootable hardware that can be attached to a computer and examined. Doing so is advantageous because the data is easier to access, store and authenticate, as well as far less prone to corruption by the operating system or through examination. Specialized software enables me to assemble the image files as a single virtual hard drive, identical in every way to the original. On those rare occasions when a physical duplicate is needed, I reconstitute those image files to a forensically sterile hard drive and use cryptographic algorithms to demonstrate that the restored drive is a faithful counterpart of the original. Of course, putting the digital toothpaste back in the tube that way takes time and costs money.

"Do we ask the court for a restored drive?"

"You could," I said, "and you might get it if the other side doesn't object."

Incredibly, lawyers who'd never permit the opposition to fish about in their client's home or office blithely give the green light when it comes to trolling client hard drives. No matter how much you want to demonstrate good faith or that your client has nothing to hide, be wary of allowing the other side to look at the drives.
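The column mentions using cryptographic algorithms to show that a restored drive matches the original. A simplified sketch of that idea follows, under stated assumptions: the image segments here are raw, uncompressed pieces of the drive (real forensic image formats are typically compressed and need specialized software to read), the hash is SHA-256 (the column does not name an algorithm), and the function names are invented.

```python
import hashlib
from pathlib import Path
from typing import Iterable

CHUNK = 1024 * 1024  # read one megabyte at a time


def hash_in_order(paths: Iterable[Path], chunk_size: int = CHUNK) -> str:
    """Hash the bytes of one or more files as a single continuous stream."""
    digest = hashlib.sha256()
    for path in paths:
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
    return digest.hexdigest()


def restored_matches(segments: Iterable[Path], restored: Path) -> bool:
    """True if the restored copy hashes to the same value as the segments in order."""
    return hash_in_order(segments) == hash_in_order([restored])
```

If even one byte of the restored copy differs, the two hash values will not match, which is what lets an examiner say the duplicate is a faithful counterpart of the original without comparing the contents by eye.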
