Tuesday, August 19, 2008

ePUB from html - Book Glutton ok so far

Through Wikipedia I have found Book Glutton, which has a handy service to create ePUB from html. I saved some test text in Open Office as html and then tried it. The test had a headline in a different size and one word in bold.



This is how it looks as epub in Adobe Digital Editions. The bold is OK but it lost the size difference for the top line. It is OK as html in Firefox. So I think I will use bold for headings and just accept that the type is all the same size.

Acrobat.com has accepted the two files to store but it claims they cannot be displayed. The links should work, so comments are welcome.

the html

the epub

More later on what Adobe is doing, as and when it becomes just a bit more clear. The formats now are such that open source is well worth a look for comparison.

ePub from InDesign, could it be more intuitive?

On September 4th Waterstones in the UK will be stocking the Sony Reader. That includes Exeter, where I live. Often this blog is about distant places such as Ghent where the PDF standards come from. It will be interesting for me that local can connect with significant technology such as consumer electronics to display ePub.

I have booked some time on Sept 4th at Life Bytes, an internet resource opposite the Odeon on Sidwell Street. Advertised as "late afternoon", by which time there should be something working over the cable. The Kindle is not available in the UK anyway and there is no date as far as I know. So it still makes some sense to download files to a desktop and then cable to a device.

LifeBytes has a copy of the current InDesign and I have tried out the option to create ePUB. There is a file but no formatting. I might as well be working with Notepad, which by the way I often do. The menus are not that easy to follow. XHTML is shown under multimedia or something like that (I am writing this at home so memory is not exact). So perhaps this idea of losing all formatting would make sense if you wanted to work in Dreamweaver. All I want to do is show a text with some formatting on a Sony Reader.

So it is good to have found a blog entry about Digital Editions even though I cannot follow all of it. The good news is that the formatting was intended to disappear so the explanation is not that I made some simple error in trying to use InDesign.

Text styling: only Character Styles, Paragraph Styles, and Object Styles are exported to ePub. All freehand styling is discarded. This means that if you want a single word or phrase in bold type, you need to create a character style (i.e. bold_text), and apply this character style to all of the text that you need to be bold. If you then decided that you want some of those words bold and italic, then you must create a second style to apply to those select words to be turned bold and italic.


Layout could be just a bit more complicated

Layout: when threading together text fields, they will always be exported in the correct order. However, they will also always be in one flow. All of the layout editing that you have done to place the text boxes with respect to each other or the page is discarded. You will have to style the layout of the ePub manually, after export.


But is still possible if you look inside the zip file


After exporting the eBook to ePub format, you can manually edit its content and styling. This is easy to do, because ePub format is really a zip file. To edit these components, follow these steps:

1. Using a zip compression/decompression tool, extract the contents of the ePub archive to a known location.
2. Apply the required edits to the individual components.
3. Re-archive all of the components. The order of the files in the archive matters. In order to comply with the ePub specification, add the mimetype file first, and make sure that it is not compressed. Next, add the META-INF and the OEBPS directories to the archive.
4. Make sure that the extension of the archive file is .epub, not .zip.

Note that the XHTML files, chapter list (OPF file) and the CSS stylesheet can be found in OEBPS directory.
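The re-archiving step is the one most likely to go wrong by hand, so here is a minimal sketch of it in Python, using only the standard zipfile module. The function name pack_epub and the folder and file names in the usage line are my own, for illustration only. It assumes the ePub has already been extracted to a folder containing mimetype, META-INF and OEBPS, as described in the steps above.

```python
import os
import zipfile


def pack_epub(source_dir, epub_path):
    """Re-archive an extracted ePub folder (hypothetical helper, not an Adobe tool)."""
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype file is added first and stored without compression.
        archive.write(
            os.path.join(source_dir, "mimetype"),
            arcname="mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        # Then the META-INF and OEBPS directories, compressed as usual.
        for folder in ("META-INF", "OEBPS"):
            for root, dirs, files in os.walk(os.path.join(source_dir, folder)):
                dirs.sort()
                for name in sorted(files):
                    full_path = os.path.join(root, name)
                    # Zip entries use forward slashes whatever the operating system.
                    relative = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                    archive.write(
                        full_path,
                        arcname=relative,
                        compress_type=zipfile.ZIP_DEFLATED,
                    )
```

Something like pack_epub("my_book", "my_book.epub") would then write the archive with the .epub extension directly, so there is no renaming from .zip afterwards. I have not checked the result against every reader, so treat it as a starting point.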


Fortunately the Sony Reader also supports Adobe Digital Editions (ADE) and PDF. What seems to happen is that a PDF file is seamlessly converted into Flash Paper as it used to be known. There is still no news on a MARS plugin for Acrobat 9 (expected July or early August according to the MARS blog). So there appears to be no urgency in a more XML friendly version of PDF. If it is possible to move from PDF to Flash, could it be easier to create ePUB from text or word processing?

I have had a quick browse of open source approaches. More on this in a later post. Possible workflow: import open doc to Scribus, click the "save as ePUB" button.

Clearly life is more complicated than that, but I hope to have more clues by Sept 4th.

Sunday, August 17, 2008

John Dvorak on Adobe and Linux

John Dvorak has suggested that Adobe could do more with Linux as part of the situation around Microsoft and Silverlight. One thing that strikes me is that Dvorak must think that such a move is possible.

Previously it seems to have been considered that Linux may work on a server but is not ready for the desktop. As there is already a range of open source applications, as presented at the Libre Graphics meeting, it seems possible to argue that they could soon compare with many of the functions from suites for creatives.

Perhaps the latest features would be missing or harder to implement. However, I am still interested in the kind of capability associated with Adobe Classic - PostScript and PDF from long ago at the start of "desktop publishing". Well, maybe PDF is more recent but I will come back to some dates another time.

The ePUB format seems to be such that open source could cope with it. This could be one area where the Linux desktop is "good enough".

Fujifilm at Total Print

The Total Print site states that Fujifilm will be there but I can't find anything else through Google News. Their inkjet is not expected to exist till 2010 but they also offer workflow. I have changed the story on WWWatford to include something with the Heidelberg aspect. It seems more likely to me that Heidelberg is there to get ready for the next phase of digital printing. I cannot see how litho can compete for the short runs typical of the kit that has been at this show so far. But the inkjet shown at drupa could compare on B2.

All the talk of workflow can be represented more simply as PDF and JDF. This could be boring but has not yet become commonplace.

This post will be updated or links shown in the comments. The text here will become a story for OhmyNews around the time of Total Print.

ePUB, EPUB, or .epub? to be continued

This post is a version of a draft of a story for OhmyNews around the time of the Online Information show in December. The comments will show links to later posts.

Last year I was invited to join the publishing panel when someone was a bit late. This year I hope to just ask questions. Such as what is XHTML? Why do they call it .epub or whatever? Why would InDesign lose all my formatting when I try to save as EPUB? Do I have to learn about Cascading Style Sheets? Could this be a bit more simple?

Similar questions for the London College of Communication Futures Conference in October at the time of Total Print so there should be some clues.

The Sony Reader will be available in the UK next month. Even though there seems to be more US interest in the Kindle, there is no UK Kindle, and Waterstones still has credibility in the book world. Before December there could be enough interest in this area for somebody to explain how ePUB is open to most writers.

Meanwhile PDF is another option. There may be an explanation to be found on how Adobe thinks about all this. Still no MARS plugin for Acrobat 9 as far as I know. Possibly Adobe is so segmented marketingwise that it is left to users to make choices and a pattern will emerge later.

Previously, story for OhmyNews


Sony Reader Opens to EPUB Format for Digital Books

Thursday, August 14, 2008

Microsoft validates Flash

The summer is not over but I am trying to engage again with where technology is heading. I find Acrobat 9 fairly confusing still. Maybe the launch was at an odd time of year but there may be more to it. So far my impression is that there is not much new around PDF as such; the direction is still Flash.

Found a story in the Register explaining what Microsoft may intend. This could explain why Adobe is pushing Flash so urgently. A technology company usually works with technology that is not quite ready to be robust. A suitable version will emerge as the user requirement takes shape. I realise that most of the technology around PDF as in Adobe Classic is now very reliable and easy to clone as it should be for an ISO standard.

However, I still find that there are few people using the Job Definition Format features in Acrobat. eBooks are still not recognised as such though there are masses of text online. I think this area is worth staying with for a while, so I have in mind the Online Information Show as a target date to have worked out more about flat text documents etc. Video is still a concern, but more for next year starting with BETT. The UK students are engaging with video, animation and all things Flash. The readers of Information World Review are still concerned with books and journals, is my guess. And the Total Print Show may be going back to litho. Not sure about this.

I am also still confused as to what Adobe intends around MARS and the Digital Editions Reader. I had thought that an XML rewrite for PDF would be a sensible direction. Nothing much is happening however. Sony have announced support for EPUB in their Reader, available in the UK next month. XHTML, don't really understand it but this sounds good. Even if the tools are not available widely to create an eBook, PDF will also load in the Reader. The Kindle is more like a phone but the Reader seems enough like a book to make a point.

This post may seem to have gone off topic but the conclusion seems to be that whatever Adobe is trying to do might make more sense next year. Meanwhile most of the time I will be going back to text.

Saturday, August 02, 2008

Jim King blog continues

http://blogs.adobe.com/insidepdf/

This blog is not dead, but has been resting.

An update on standards and an explanation of why text is sometimes hard to extract. My own problem last week was that a cut and paste ended up with random characters I had never seen before. A font problem I should think.

So what about XHTML and why should we try out the Adobe Digital Editions? More on this after summer holidays.