Searching in Excel files

I had to find a specific Microsoft Excel spreadsheet among thousands of files on a mapped network drive (O:) on a Windows 7 computer. The problem was that Windows only indexed and searched the contents of local disks such as C: and D:, so I could only search for keywords in local Excel files. The workaround is not optimal, because it means temporarily copying thousands of Excel files to a local drive (D:), but since I knew I was looking for an Excel file, I could limit the copy to that file type. But when you need something, you need something. This is how I solved it.

– Mount the Windows share on a Linux server (requires sudo rights)

– Find the total size of all Excel files on the mapped network drive (so that you know how much space you need on your local drive)

– Copy all Excel files from the share to the local disk

– Let Windows 7 index the Excel files locally (this should happen automatically when new files are added)

– Finally, search for keywords in Windows Explorer to find the Excel spreadsheet in question.

First mount the share:
sudo mount -t cifs -o username=,domain=example.com //WIN_PC_IP/ /mountdir

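where /mountdir is the directory used as the mount point. You can name it anything, for instance after your username. The mount command will not create it, so it must exist beforehand (for example, sudo mkdir /mountdir).
WIN_PC_IP is the IP address of the Windows computer where your share is located.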

To find the total size of the files (in bytes):

cd mountdir

find . -type f -iname '*.xls*' -exec ls -l {} \; | awk '{ total += $5 } END { print total }'
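The `ls -l` approach above depends on the size being the fifth column, which can shift if an owner or group name contains a space. As a hedged alternative, here is a minimal sketch that asks find for the size directly. It assumes GNU find (for `-printf`), and the function name `total_excel_bytes` is made up for this example. It counts only .xls and .xlsx files, the two types copied in the next step.

```shell
# Print the combined size, in bytes, of all .xls/.xlsx files below a directory.
# Assumption: GNU find is available (-printf is a GNU extension).
# total_excel_bytes is a hypothetical helper name, not an existing tool.
total_excel_bytes() {
    dir=${1:-.}
    # %s prints the file size in bytes, one file per line.
    find "$dir" -type f \( -iname '*.xls' -o -iname '*.xlsx' \) -printf '%s\n' |
        # %.0f avoids scientific notation for large totals; prints 0 if no files match.
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Called as `total_excel_bytes mountdir`, it prints a single number that you can compare with the free space on the local drive.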

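Then find all the Excel files and copy them to a new folder, keeping their directory structure. Searching only mountdir keeps find from descending into the new folder and copying the copies again:
cd ..
mkdir EXCEL-FILES-FOLDER/
find mountdir -type f -iname '*.xls' -exec cp --parents {} EXCEL-FILES-FOLDER/ \;
find mountdir -type f -iname '*.xlsx' -exec cp --parents {} EXCEL-FILES-FOLDER/ \;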
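The copy step above can also be wrapped in a small function that handles both extensions in one pass. This is only a sketch under a few assumptions: GNU cp is available (for `--parents`), the destination lies outside the source tree, and `copy_excel_tree` is a name invented for this example.

```shell
# Copy every .xls/.xlsx file below SRC into DEST, preserving the directory layout.
# Assumptions: GNU cp (--parents is a GNU extension); DEST is outside SRC.
# copy_excel_tree is a hypothetical helper name.
copy_excel_tree() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Resolve DEST to an absolute path before changing into SRC.
    dest_abs=$(cd "$dest" && pwd) || return 1
    (
        # Run find from inside SRC so the copied paths are relative to it.
        cd "$src" || exit 1
        find . -type f \( -iname '*.xls' -o -iname '*.xlsx' \) \
            -exec cp --parents -- {} "$dest_abs"/ \;
    )
}
```

For example, `copy_excel_tree mountdir EXCEL-FILES-FOLDER` would place mountdir/a/b.xls at EXCEL-FILES-FOLDER/a/b.xls, without the extra mountdir level that the two find commands produce.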

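Copy the files to the local D: drive (using CMD in Windows 7). This assumes EXCEL-FILES-FOLDER is visible from Windows as o:\EXCEL-FILES-FOLDER. Because the files were copied with their directory structure, use xcopy with /S, since plain copy does not descend into subfolders:

D:\>xcopy "o:\EXCEL-FILES-FOLDER" "d:\EXCEL-FILES-FOLDER" /S /I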

 
You might have to wait a while: with a lot of files, the Windows 7 computer might need a day or two to finish indexing. “PATIENCE YOU MUST HAVE my young padawan”.
