*Update*: This five-year-old post has not kept up with the latest versions of Firefox and Gmail. Read on, but without much hope of making it work.
Lately, I needed to pull a few of my chats out of Gmail to archive them offline. Not to my surprise, Gmail offers no service that exports this data as XML, JSON or any comparable format.
So I decided to start a small venture to get this working.
After a little reverse engineering, I figured out that every chat has a unique msgID, which can be used to fetch a clean, printer-ready HTML version of it. For this to work, though, I first needed to collect all the msgIDs.
The Trick
- Every email is saved as a conversation; each conversation has a msgID and subsequent threadIDs
- When a mail is printed by its threadID, all the threads in the mail are concatenated into one clean HTML page
- Once you reach this printer-friendly page, it can be saved as plain HTML
- Then, using MS Word, you can concatenate these files and, at the end, generate a nice index of the emails.
The Prick
I assume you have already set up the required label. If not, please work through a tutorial on Gmail labels, and once you have mastered the art, come back to this page.
The following steps have been tested on Firefox 3.5.5.
STEP 1
Download and install iMacros for your version of Firefox. The best way to do that is to Google for it!
Once iMacros is installed, you will have a folder named iMacros inside your My Documents folder. Go into that folder and then into Macros.
STEP 2
Create the following files there (each file name is followed by its content)
SaveMails.iim
VERSION BUILD=3700331
TAB T=1
TAB CLOSEALLOTHERS
CMDLINE !DATASOURCE mails.csv
SET !DATASOURCE_COLUMNS 1
'SET !LOOP 2
SET !DATASOURCE_LINE {{!LOOP}}
SET !EXTRACT ""
URL GOTO={{!COL1}}
SEARCH SOURCE=REGEXP:"((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\s][0-9]{1,2},[\s][0-9]{1,4}[\s]at[\s][\d]{0,2}:[\d]{0,2}[\s](AM|PM))" EXTRACT=$1
SAVEAS TYPE=HTM FOLDER=* FILE={{!EXTRACT}}
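For anyone wondering what the SEARCH pattern above picks out, here is a minimal JavaScript sketch of the same regular expression. The sample page fragment is a made-up assumption, not real Gmail output, and the helper name extractStamp is only for illustration.

```javascript
// Same pattern as in SaveMails.iim, written as a JavaScript regular expression.
const STAMP = /((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\s][0-9]{1,2},[\s][0-9]{1,4}[\s]at[\s][\d]{0,2}:[\d]{0,2}[\s](AM|PM))/;

// Returns the first timestamp found in the page source, or null if none.
// (extractStamp is a hypothetical helper name.)
function extractStamp(pageSource) {
  const match = STAMP.exec(pageSource);
  return match ? match[1] : null;
}

// Hypothetical fragment of a printer-friendly page, for illustration only.
const sample = "<td>Fri, Jan 8, 2010 at 9:41 PM</td>";
console.log(extractStamp(sample));
```

The extracted text is what SaveMails.iim then uses as the file name in its SAVEAS line.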
Links.iim
VERSION BUILD=6650406
TAB T=1
URL GOTO=javascript:javascript%3Am%3Dprompt(%22How%20many%20mails%20are%20there%20in%20this%20label%3F%22)%3Bu%3Dwindow.location.href%2Cs%3D%22%22%3Bl%3Du.substring(u.lastIndexOf(%22%3D%22)%2B1)%3Bu%3Du.substring(0%2Cu.lastIndexOf(%22%3F%22))%3Bfor(i%3D0%3Bi%3Cm%3Bi%2B%3D50)s%2B%3D(u%2B%22%3Fs%3Dl%26l%3D%22%2Bl%2B%22%26st%3D%22%2Bi%2B%22%3Cbr%3E%22)%3Bdocument.getElementsByName(%22q%22)%5B0%5D.value%3Ds%3B
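The URL-encoded one-liner in Links.iim is hard to read, so here is a decoded sketch of what it does, written as an ordinary JavaScript function. The function name and the example address are assumptions for illustration; the original asks for the count with prompt() and joins the links with <br> so that the browser displays them, while this version simply returns an array.

```javascript
// Readable equivalent of the bookmarklet in Links.iim (a sketch, not the original).
// href:  address of the label page in Basic HTML view
// total: approximate number of mails in the label
function buildPageLinks(href, total) {
  // Label name: everything after the last "=" in the address.
  const label = href.substring(href.lastIndexOf("=") + 1);
  // Base address: everything before the last "?".
  const base = href.substring(0, href.lastIndexOf("?"));
  const links = [];
  // One link per listing page, stepping 50 mails at a time.
  for (let start = 0; start < total; start += 50) {
    links.push(base + "?s=l&l=" + label + "&st=" + start);
  }
  return links;
}

// Hypothetical address, for illustration only.
const pages = buildPageLinks("https://mail.example.com/h/?s=l&l=Chats", 120);
console.log(pages.join("\n"));
```

With a count of 120 this yields three links, for offsets 0, 50 and 100, which is why an approximate count is good enough.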
ShowMessageIDs.iim
VERSION BUILD=3700331
CMDLINE !DATASOURCE links.csv
SET !DATASOURCE_COLUMNS 1
'SET !LOOP 2
SET !VAR1 {{!LOOP}}
ADD !VAR1
TAB OPEN
TAB T= {{!VAR1}}
SET !DATASOURCE_LINE {{!LOOP}}
URL GOTO={{!COL1}}
URL GOTO=javascript:s%3D%22%22%2Cu%3Dwindow.location.href%2Cs%3D%22%22%3Bl%3Du.substring(u.lastIndexOf(%22%3D%22)%2B1)%3Bu%3Du.substring(0%2Cu.lastIndexOf(%22%3F%22))%3Bt%3Ddocument.getElementsByName(%22t%22)%3Bfor(i%3D0%3Bi%3Ct.length%3Bi%2B%2B)s%2B%3D(u%2B%22%3Fv%3Dpt%26s%3Dl%26l%3D%22%2Bl%2B%22%26th%3D%22)%2B(t%5Bi%5D.value)%2B%22%3Cbr%3E%22%3Bdocument.getElementsByName(%22q%22)%5B0%5D%3D(s)%3B
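Decoded, the last line of ShowMessageIDs.iim collects the values of the form fields named "t" on a listing page and turns each one into a printer-friendly (v=pt) link. The sketch below mirrors that logic as a plain function; the function name, the example address and the thread IDs are assumptions for illustration only.

```javascript
// Readable equivalent of the bookmarklet in ShowMessageIDs.iim (a sketch).
// href:      address of one listing page in Basic HTML view
// threadIds: values of the form fields named "t" on that page
function buildPrintLinks(href, threadIds) {
  // Everything after the last "=". For a listing address that ends in
  // "&st=50" this is the offset, not the label name; the original
  // bookmarklet behaves the same way.
  const label = href.substring(href.lastIndexOf("=") + 1);
  // Base address: everything before the last "?".
  const base = href.substring(0, href.lastIndexOf("?"));
  // One printer-friendly link per thread ID.
  return threadIds.map(id => base + "?v=pt&s=l&l=" + label + "&th=" + id);
}

// Hypothetical address and thread IDs, for illustration only.
const printLinks = buildPrintLinks(
  "https://mail.example.com/h/?s=l&l=Chats&st=50",
  ["abc123", "def456"]
);
console.log(printLinks.join("\n"));
```

In the macro itself the resulting string is the value of the last expression in the javascript: URL, so the browser shows it as a page of links, ready to be copied into mails.csv.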
STEP 3
Switch your inbox to Basic HTML view (the link is at the bottom of the page).
Make sure you are viewing the right label, i.e. the one whose mails you want to archive.
STEP 5
In iMacros, locate the Links.iim file and play it. When prompted, enter the approximate number of emails in this label.
You will then see a list of links on the page. Select all of the text, and create a new file named links.csv in "My Documents\iMacros\DataSources". Use Notepad for this, not Microsoft Excel. Paste the text there and save.
STEP 6
Back to Firefox! Locate the macro named ShowMessageIDs.iim. Play (Loop) this macro, changing the default value "3" to the approximate number of links you saved in STEP 5 (e.g. 10; an approximate, generously large value will do). This opens a number of tabs in your browser, depending on how many mails you are archiving.
STEP 7
Create a new file named mails.csv inside "My Documents\iMacros\DataSources". Open this file in Notepad, then copy the contents of each tab opened in the previous step into it. Finally, save the file.
STEP 8
Disable JavaScript in Firefox by unchecking Tools-->Options-->Content-->Enable JavaScript.
Now locate the macro SaveMails.iim and Play (Loop) it, setting the loop count to the number of mails you wish to archive. An approximate, generously large value will do.
STEP 9
Once STEP 8 completes (which will take some time), you will have each of your mails in the folder "My Documents\iMacros\Downloads". Now comes the more interesting part. Open Microsoft Word (the code has been tested on Office 2007) and go to VIEW-->MACROS-->VIEW MACROS-->CREATE. Provide any arbitrary name, it doesn't matter!
A new window will open.
Select all the text in it and delete it. Then paste the following code:
Sub StartMerge()
'
' StartMerge
' @Author Bageshwar Pratap Narain
' Run this macro to select a directory and merge all the HTML files inside it into a single file,
' and automatically create a date-based index
'
' Not used below; the folder is picked through the File Open dialog instead
Folder = "c:\documents and settings\bageshwar\"
Dim strPath As String
Dim strFile As String
Dim temp As String
Dim x As Integer
Dim doc As Word.Document
Dim current As Word.Document
'Set current = ActiveDocument
With Dialogs(wdDialogFileOpen)
.Name = "*.*"
If .Display = -1 Then
strPath = Options.DefaultFilePath(wdDocumentsPath)
End If
End With
Documents.Add Template:="Normal", NewTemplate:=False, DocumentType:=0
Set current = Windows(1).Document
'Windows(2).Activate
strFile = Dir(strPath + "\")
'MsgBox (strPath)
Do While strFile <> ""
x = x + 1
' Process the current file before asking Dir for the next one,
' so that the first file in the folder is not skipped
temp = strPath + "\" + strFile
Set doc = Documents.Open(temp)
Selection.WholeStory
Selection.Copy
current.Activate
Selection.PasteAndFormat (wdPasteDefault)
doc.Close
strFile = Dir ' Get next entry.
Loop
Set objWdDoc = Word.Application.ActiveDocument
' Set our range to be the entire document contents
Set objWdRange = objWdDoc.Content
' To be used for the result string
Dim Result As String
' Create a regular expression object.
Set regEx = CreateObject("VBScript.RegExp")
regEx.IgnoreCase = False
regEx.Pattern = ("((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\s][0-9]{1,2},[\s][0-9]{1,4}[\s]at[\s][\d]{0,2}:[\d]{0,2}[\s](AM|PM))")
Dim temp1 As Long
Windows(1).Activate
Selection.HomeKey Unit:=wdStory
Do
' Get the first match (Global = False, remember)
Set Matches = regEx.Execute(objWdRange)
' MsgBox (Matches.Count)
If Matches.Count = 0 Then
Exit Do
End If
' Get the first match from the MatchCollection.
Set Match = Matches(0)
'objWdRange.MoveStart(wdCharacter, Len(Match.value) + 1)
temp1 = objWdRange.MoveStart(wdCharacter, Match.FirstIndex + Len(Match.Value) + 1)
'MsgBox (Match)
stylize2 (Match)
Loop
' insertIndex
'
'
Selection.HomeKey Unit:=wdStory
Selection.InsertNewPage
Selection.TypeParagraph
With ActiveDocument
.TablesOfContents.Add Range:=Selection.Range, RightAlignPageNumbers:= _
True, UseHeadingStyles:=True, UpperHeadingLevel:=1, _
LowerHeadingLevel:=3, IncludePageNumbers:=True, AddedStyles:="", _
UseHyperlinks:=True, HidePageNumbersInWeb:=True, UseOutlineLevels:= _
True
.TablesOfContents(1).TabLeader = wdTabLeaderDots
.TablesOfContents.Format = wdIndexIndent
End With
End Sub
Sub stylize2(text As String)
'
' stylize2 Macro
'
'
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Style = ActiveDocument.Styles("Heading 1")
With Selection.Find.Replacement.ParagraphFormat
With .Shading
.Texture = wdTextureNone
.ForegroundPatternColor = wdColorBlack
.BackgroundPatternColor = wdColorBlack
End With
.Borders.Shadow = False
End With
With Selection.Find
.text = text
.Replacement.text = text
.Forward = True
.Wrap = wdFindAsk
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
With Selection
'If .Find.Forward = True Then
.Collapse Direction:=wdCollapseStart
'Else
' .Collapse Direction:=wdCollapseEnd
'End If
.Find.Execute Replace:=wdReplaceOne
'If .Find.Forward = True Then
' .Collapse Direction:=wdCollapseEnd
'Else
' .Collapse Direction:=wdCollapseStart
'End If
'.Find.Execute
End With
End Sub
FINAL STEP
Save the file and close the macro editor. Open MS Word, press ALT+F8, and double-click StartMerge.
Finally, save the file, print it, eat it or throw it away. Enjoy!
~~~~~~~~~~~~H4x0r~~~~~~~~~~~~
ISSUES
The following issues remain unresolved:
- The MS Word macro becomes very slow with more than 100 mails
- The mail files still need to be saved in groups, month-wise
- The regular expression matching is extremely slow
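As a starting point for the month-wise grouping, here is a minimal JavaScript sketch that sorts saved file names into buckets by year and month. It assumes the file names begin with the extracted timestamp (for example "Jan 8, 2010 at ..."); the function names and the sample file names are hypothetical, and this is not part of the tested workflow above.

```javascript
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Turns a name starting with a stamp such as "Jan 8, 2010 at ..."
// into a key such as "2010-01"; returns null if no stamp is found.
function monthKey(name) {
  const m = /^(\w{3})\s\d{1,2},\s(\d{1,4})\s/.exec(name);
  if (!m) return null;
  const index = MONTHS.indexOf(m[1]);
  if (index < 0) return null;
  return m[2] + "-" + String(index + 1).padStart(2, "0");
}

// Groups file names by month key; names without a stamp go under "unknown".
function groupByMonth(fileNames) {
  const groups = {};
  for (const name of fileNames) {
    const key = monthKey(name) || "unknown";
    (groups[key] = groups[key] || []).push(name);
  }
  return groups;
}

// Hypothetical file names, for illustration only.
console.log(groupByMonth([
  "Jan 8, 2010 at 9_41 PM.htm",
  "Jan 20, 2010 at 1_05 AM.htm",
  "Dec 3, 2009 at 11_00 PM.htm",
  "notes.htm"
]));
```

Each bucket could then be merged into its own Word document, which should also keep the macro away from the 100-mail slowdown.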