Sunday, May 9, 2010

Archiving your Favourite Mails in Gmail


*Update* : This five-year-old post has not kept up with the latest versions of Firefox and Gmail. Read on, but do not expect the steps to work as written.




Lately, I needed to archive a few of my chats offline but, not to my surprise, I found that Gmail does not offer any service that would export my data in XML, JSON or a similar format.

So, I decided to start off a small venture to get this working.

After a small amount of reverse engineering, I figured out that every chat has a unique msgID, which can be used to fetch a neat and clean, printer-ready HTML page. But for this to work, I needed to fetch all the msgIDs.

The Trick

  • Every email is saved as a conversation; each conversation has a msgID and subsequent threadIDs
  • When a mail is printed by its threadID, all the threads in the mail are concatenated into one neat and clean HTML page
  • Once you reach this printer-friendly page, it can be saved in plain HTML format
  • Then, using MS Word, you can concatenate these files and, in the end, generate a nice index of emails

 

The Prick

I assume you have already set up the required label. If not, please work through a tutorial on Gmail labels and, once you have mastered the art, come back to this page.

The following steps have been tested on Firefox 3.5.5.

STEP 1

Download and install iMacros for your version of Firefox. The best way to do that is to Google for it!

Once you install iMacros, you will have a folder named iMacros inside your My Documents folder. Go inside that folder and then inside Macros.

STEP 2

Create the following files there (with the content shown):

SaveMails.iim

VERSION BUILD=3700331

TAB T=1

TAB CLOSEALLOTHERS

CMDLINE !DATASOURCE mails.csv

SET !DATASOURCE_COLUMNS 1

'SET !LOOP 2

SET !DATASOURCE_LINE {{!LOOP}}

SET !EXTRACT ""

URL GOTO={{!COL1}}

SEARCH SOURCE=REGEXP:"((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\s][0-9]{1,2},[\s][0-9]{1,4}[\s]at[\s][\d]{0,2}:[\d]{0,2}[\s](AM|PM))" EXTRACT=$1

SAVEAS TYPE=HTM FOLDER=* FILE={{!EXTRACT}}
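
The SEARCH line is the least obvious part of this macro: it pulls the first timestamp out of the printer-friendly page, and SAVEAS then uses that text as the file name. Here is a minimal JavaScript sketch of the same pattern, run against a made-up sample string. The helper name `extractTimestamp` and the sample text are illustrative assumptions and are not part of iMacros.

```javascript
// Same pattern as the SEARCH line in SaveMails.iim, written as a JavaScript regex.
const TIMESTAMP = /((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\s][0-9]{1,2},[\s][0-9]{1,4}[\s]at[\s][\d]{0,2}:[\d]{0,2}[\s](AM|PM))/;

// Hypothetical helper: returns the first timestamp found in the text, or null.
function extractTimestamp(text) {
  const match = TIMESTAMP.exec(text);
  return match ? match[1] : null;
}

// Made-up sample of the kind of text a printer-friendly page might contain.
const sample = "From: someone  Sun, May 9, 2010 at 10:15 AM  To: me";
console.log(extractTimestamp(sample)); // May 9, 2010 at 10:15 AM
```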


 

Links.iim

VERSION BUILD=6650406

TAB T=1

URL GOTO=javascript:javascript%3Am%3Dprompt(%22How%20many%20mails%20are%20there%20in%20this%20label%3F%22)%3Bu%3Dwindow.location.href%2Cs%3D%22%22%3Bl%3Du.substring(u.lastIndexOf(%22%3D%22)%2B1)%3Bu%3Du.substring(0%2Cu.lastIndexOf(%22%3F%22))%3Bfor(i%3D0%3Bi%3Cm%3Bi%2B%3D50)s%2B%3D(u%2B%22%3Fs%3Dl%26l%3D%22%2Bl%2B%22%26st%3D%22%2Bi%2B%22%3Cbr%3E%22)%3Bdocument.getElementsByName(%22q%22)%5B0%5D.value%3Ds%3B
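
The URL-encoded line above is hard to read, so here is a decoded sketch of what it does, written as a plain function that can run outside the browser. The function name `buildPageLinks` and the example address are assumptions made for illustration. The real bookmarklet reads the address from `window.location.href` and the count from a prompt.

```javascript
// Readable sketch of the bookmarklet in Links.iim.
// pageUrl:   address of the label page in Basic HTML view
// mailCount: the number typed into the prompt
function buildPageLinks(pageUrl, mailCount) {
  // The label is whatever follows the last "=" in the address.
  const label = pageUrl.substring(pageUrl.lastIndexOf("=") + 1);
  // Everything before the last "?" is the base address.
  const base = pageUrl.substring(0, pageUrl.lastIndexOf("?"));
  const links = [];
  // One link per page of 50 conversations: st=0, st=50, st=100, ...
  for (let start = 0; start < mailCount; start += 50) {
    links.push(base + "?s=l&l=" + label + "&st=" + start);
  }
  return links;
}

// Made-up address, for illustration only.
const pageLinks = buildPageLinks("https://mail.example.com/h/abc/?s=l&l=Archive", 120);
console.log(pageLinks.join("\n"));
```

The original appends `<br>` to every link so that the browser shows one link per line, which makes the list easy to copy into links.csv.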


 

ShowMessageIDs.iim

VERSION BUILD=3700331

CMDLINE !DATASOURCE links.csv

SET !DATASOURCE_COLUMNS 1

'SET !LOOP 2

SET !VAR1 {{!LOOP}}

ADD !VAR1 1

TAB OPEN

TAB T={{!VAR1}}

SET !DATASOURCE_LINE {{!LOOP}}

URL GOTO={{!COL1}}

URL GOTO=javascript:s%3D%22%22%2Cu%3Dwindow.location.href%2Cs%3D%22%22%3Bl%3Du.substring(u.lastIndexOf(%22%3D%22)%2B1)%3Bu%3Du.substring(0%2Cu.lastIndexOf(%22%3F%22))%3Bt%3Ddocument.getElementsByName(%22t%22)%3Bfor(i%3D0%3Bi%3Ct.length%3Bi%2B%2B)s%2B%3D(u%2B%22%3Fv%3Dpt%26s%3Dl%26l%3D%22%2Bl%2B%22%26th%3D%22)%2B(t%5Bi%5D.value)%2B%22%3Cbr%3E%22%3Bdocument.getElementsByName(%22q%22)%5B0%5D%3D(s)%3B
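
Decoded, the last line of this macro turns every conversation listed on the page into a printer-friendly link. A sketch of the same logic as a plain function follows. The name `buildPrintLinks`, the example address and the thread IDs are assumptions for illustration. In the browser, the IDs come from the `value` of every element named `t` on the page.

```javascript
// Readable sketch of the bookmarklet in ShowMessageIDs.iim.
// pageUrl:   address of one page of the label listing (a line from links.csv)
// threadIds: values of the elements named "t" on that page
function buildPrintLinks(pageUrl, threadIds) {
  // As in the original, "l" is whatever follows the last "=" in the address.
  // For an address ending in "&st=50" that is "50", not the label name.
  const l = pageUrl.substring(pageUrl.lastIndexOf("=") + 1);
  // Everything before the last "?" is the base address.
  const base = pageUrl.substring(0, pageUrl.lastIndexOf("?"));
  // v=pt asks for the printer-friendly view; th carries the thread ID.
  return threadIds.map((id) => base + "?v=pt&s=l&l=" + l + "&th=" + id);
}

// Made-up address and thread IDs, for illustration only.
const printLinks = buildPrintLinks(
  "https://mail.example.com/h/abc/?s=l&l=Archive&st=0",
  ["111aaa", "222bbb"]
);
console.log(printLinks.join("\n"));
```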

STEP 3

Switch your Inbox to Basic HTML view (the link is at the bottom of the page).



 
STEP 4

Make sure you are viewing the right label, i.e. the one whose mails you want to archive.

STEP 5

In iMacros, locate the Links.iim file and play it. Enter the approximate number of emails that this label has.






 


After that, you will see a list of links on the page. Select the whole text, and create a new file named links.csv in "My Documents\iMacros\DataSources". Use Notepad to do this, not Microsoft Excel. Paste the text there and save.


 


 

STEP 6

Back to Firefox! Locate the macro named ShowMessageIDs.iim. Play (Loop) this macro, changing the default value "3" to the approximate number of links you saved in STEP 5 (e.g. 10; an approximate, large value will suffice). This will open a few tabs in your browser, depending on the number of mails you are archiving.

STEP 7

Create a new file named mails.csv inside "My Documents\iMacros\DataSources". Open this file in Notepad, then copy the contents of each tab that opened in the previous step into this file. Finally, save the file.

STEP 8
Disable JavaScript by unchecking Tools-->Options-->Content-->Enable JavaScript in Firefox.
Now locate the macro SaveMails.iim and Play (Loop) it, setting the loop count to the number of mails you wish to archive. An approximate, large value will suffice.

STEP 9

Once STEP 8 completes (which will take some time), you will have each of your mails in the folder "My Documents\iMacros\Downloads". Now comes the more interesting part. Open Microsoft Word (the code has been tested on Office 2007). Go to VIEW-->MACROS-->VIEW MACROS-->CREATE. Provide any arbitrary name; it doesn't matter!

A new window will open.




 

Select all the text and delete it. Then paste the following code:


 

Sub StartMerge()
'
' StartMerge
' @Author Bageshwar Pratap Narain
' Run this macro to select a directory and merge all the HTML files inside it into a single file,
' then automatically create a date-based index
'
    Folder = "c:\documents and setttings\bageshwar\"
        Dim strPath As String
        Dim strFile As String
        Dim temp As String
        Dim x As Integer
        Dim doc As Word.Document
        Dim current As Word.Document
        'Set current = ActiveDocument

            With Dialogs(wdDialogFileOpen)
                .Name = "*.*"
                If .Display = -1 Then
                    strPath = Options.DefaultFilePath(wdDocumentsPath)
                End If
            End With
               
               Documents.Add Template:="Normal", NewTemplate:=False, DocumentType:=0
               Set current = Windows(1).Document
               
               'Windows(2).Activate
               
            strFile = Dir(strPath + "\")
            'MsgBox (strPath)
            Do While strFile <> ""
                x = x + 1
                temp = strPath + "\" + strFile
                Set doc = Documents.Open(temp)
                Selection.WholeStory
                Selection.Copy
                current.Activate
                Selection.PasteAndFormat (wdPasteDefault)
                doc.Close
                ' Get the next entry only after processing, so the first file is not skipped.
                strFile = Dir
            Loop
    
    
            Set objWdDoc = Word.Application.ActiveDocument
        
        
    
        ' Set our range to be the entire document contents
        Set objWdRange = objWdDoc.Content
    
        ' To be used for the result string
        Dim Result As String
    
        ' Create a regular expression object.
        Set regEx = CreateObject("VBScript.RegExp")
        
        regEx.IgnoreCase = False
    
        regEx.Pattern = ("((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\s][0-9]{1,2},[\s][0-9]{1,4}[\s]at[\s][\d]{0,2}:[\d]{0,2}[\s](AM|PM))")
        Dim temp1 As Long
                
                Windows(1).Activate
                Selection.HomeKey Unit:=wdStory
                            
                Do
                        ' Get the first match (Global = False, remember)
                        Set Matches = regEx.Execute(objWdRange)
            
                         ' MsgBox (Matches.Count)
                         If Matches.Count = 0 Then
                            Exit Do
                        End If
    
            
                        ' Get the first match from the MatchCollection.
                            Set Match = Matches(0)
                             'objWdRange.MoveStart(wdCharacter, Len(Match.value) + 1)
                        temp1 = objWdRange.MoveStart(wdCharacter, Match.FirstIndex + Len(Match.Value) + 1)
                        'MsgBox (Match)
            
                        stylize2 (Match)
                Loop
    
                
                ' insertIndex
                '
                '
                    Selection.HomeKey Unit:=wdStory
                    Selection.InsertNewPage
                    Selection.TypeParagraph
                    With ActiveDocument
                        .TablesOfContents.Add Range:=Selection.Range, RightAlignPageNumbers:= _
                            True, UseHeadingStyles:=True, UpperHeadingLevel:=1, _
                            LowerHeadingLevel:=3, IncludePageNumbers:=True, AddedStyles:="", _
                            UseHyperlinks:=True, HidePageNumbersInWeb:=True, UseOutlineLevels:= _
                            True
                        .TablesOfContents(1).TabLeader = wdTabLeaderDots
                        .TablesOfContents.Format = wdIndexIndent
                    End With
    
                    
End Sub


Sub stylize2(text As String)
'
' stylize2 Macro
'
'
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Style = ActiveDocument.Styles("Heading 1")
    With Selection.Find.Replacement.ParagraphFormat
        With .Shading
            .Texture = wdTextureNone
            .ForegroundPatternColor = wdColorBlack
            .BackgroundPatternColor = wdColorBlack
        End With
        .Borders.Shadow = False
    End With
    With Selection.Find
        .text = text
        .Replacement.text = text
        .Forward = True
        .Wrap = wdFindAsk
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    With Selection
        'If .Find.Forward = True Then
            .Collapse Direction:=wdCollapseStart
        'Else
        '    .Collapse Direction:=wdCollapseEnd
        'End If
        .Find.Execute Replace:=wdReplaceOne
        
        'If .Find.Forward = True Then
        '    .Collapse Direction:=wdCollapseEnd
        'Else
        '    .Collapse Direction:=wdCollapseStart
        'End If
        '.Find.Execute
    
    End With
End Sub


 


 

FINAL STEP

Save the file and close the macro editor. Open MS Word, press ALT+F8, and double-click StartMerge.

Finally save the file, print it, eat it or throw it. Enjoy


 

~~~~~~~~~~~~H4x0r~~~~~~~~~~~~


 

ISSUES
The following issues remain unresolved:
  • The MS Word macro becomes very slow for more than 100 mails
  • The mail files need to be saved in groups, month-wise
  • The regular expression matching is extremely slow
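
For the month-wise grouping, one possible starting point is to bucket the timestamps that SaveMails.iim already extracts. This is only a sketch and not part of the steps above. The helper names `monthKey` and `groupByMonth` and the sample timestamps are hypothetical.

```javascript
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Hypothetical helper: "May 9, 2010 at 10:15 AM" -> "2010-05", or null if unparseable.
function monthKey(timestamp) {
  const m = /^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s(\d{1,2}),\s(\d{1,4})/.exec(timestamp);
  if (!m) return null;
  const month = String(MONTHS.indexOf(m[1]) + 1).padStart(2, "0");
  return m[3] + "-" + month;
}

// Hypothetical helper: groups timestamps (or file names built from them) by month.
function groupByMonth(timestamps) {
  const groups = {};
  for (const ts of timestamps) {
    const key = monthKey(ts) || "unknown";
    if (!groups[key]) groups[key] = [];
    groups[key].push(ts);
  }
  return groups;
}

// Made-up timestamps, for illustration only.
const grouped = groupByMonth([
  "May 9, 2010 at 10:15 AM",
  "May 12, 2010 at 1:05 PM",
  "Jan 3, 2009 at 9:00 AM",
]);
console.log(grouped);
```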
 

 
