java - Remove XMP Metadata on PDF/A


Is there a way to remove the XMP metadata from a PDF/A document without losing its PDF/A conformance?

I found that using

    // iText 5 classes from com.itextpdf.text.pdf
    PdfReader reader = new PdfReader(src);
    PdfDictionary dict = reader.getCatalog();
    dict.remove(PdfName.METADATA);    // drops the XMP metadata stream
    dict.remove(PdfName.PROPERTIES);
    reader.removeUnusedObjects();

removes both the XMP and the PDF/A conformance. Is there a way to remove the XMP while retaining the standard, or to reintroduce PDF/A on the processed document?

Thanks.

You can't remove the XMP information from a PDF/A document; as you have found, doing so automatically invalidates the PDF/A conformance as well. However, the amount of information you need to retain in the XMP container is minimal.

It is described in this technical note: http://www.pdfa.org/publication/technical-note-tn0003-metadata-in-pdfa-1/

Basically, it boils down to the fact that you need to retain the PDF/A identification and conformance level; everything else can be discarded. Because we're talking about XMP, you have a number of possibilities. One is to go through a PDF library and deal with it that way. The second, and potentially the quickest and easiest, is to use a library that supports reading and writing XMP in PDF, and replace the XMP packet in the file with one that has only the information you need.

If you do that correctly (without damaging the PDF file), it shouldn't invalidate the PDF or its PDF/A compliance status (though I would strongly advise you to test the resulting PDF files with a PDF/A validator to make sure you did it right before using this in a production workflow).
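
As a rough illustration of what such a stripped-down packet could look like, here is a minimal sketch in plain Java that builds an XMP packet carrying only the PDF/A identification (pdfaid:part and pdfaid:conformance). The class and method names (MinimalPdfaXmp, build, toUtf8Bytes) are made up for this example, and the packet layout is an assumption based on the usual XMP wrapper; check it against the technical note and a validator before relying on it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any PDF library.
final class MinimalPdfaXmp {

    // Builds an XMP packet that contains nothing but the PDF/A identification.
    static String build(int part, String conformance) {
        if (part < 1 || conformance == null || !conformance.matches("[A-Za-z]")) {
            throw new IllegalArgumentException("unexpected PDF/A part or conformance level");
        }
        return "<?xpacket begin=\"\uFEFF\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?>\n"
             + "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">\n"
             + " <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n"
             + "  <rdf:Description rdf:about=\"\"\n"
             + "      xmlns:pdfaid=\"http://www.aiim.org/pdfa/ns/id/\">\n"
             + "   <pdfaid:part>" + part + "</pdfaid:part>\n"
             + "   <pdfaid:conformance>" + conformance + "</pdfaid:conformance>\n"
             + "  </rdf:Description>\n"
             + " </rdf:RDF>\n"
             + "</x:xmpmeta>\n"
             + "<?xpacket end=\"w\"?>";
    }

    // XMP packets embedded in PDF are normally UTF-8 encoded.
    static byte[] toUtf8Bytes(int part, String conformance) {
        return build(part, conformance).getBytes(StandardCharsets.UTF_8);
    }
}
```

For a PDF/A-1b file you would call MinimalPdfaXmp.toUtf8Bytes(1, "B") and hand the bytes to whatever library writes the metadata stream. With iText 5, PdfStamper.setXmpMetadata(byte[]) looks like the natural place for that, though I have not verified that it preserves conformance in every case.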

There is one caveat though, and it's mentioned in the technical note linked above.

PDF/A-1 does not require a conforming document to contain any entries in the document information dictionary at all. Nevertheless, whenever Info entries specified in the PDF 1.4 Reference (except the Trapped entry) are present, there must be an equivalent entry in the document’s metadata, and both must match according to the provisions of PDF/A-1.

So... if your document contains document properties, you either have to remove them or match them in the XMP packet.
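
To make the caveat concrete, here is a small sketch in plain Java that checks which Info entries lack a matching XMP property. The Info-to-XMP mapping follows the crosswalk as I understand it from PDF/A-1. The class name InfoXmpCheck, the flat string maps and the plain string comparison are simplifications assumed for this example: real documents need date normalisation, and dc:creator is a sequence, so treat it as an outline of the rule rather than a validator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating the "Info entries must match XMP" rule.
final class InfoXmpCheck {

    // Info dictionary key -> XMP property (Trapped is deliberately excluded).
    static final Map<String, String> CROSSWALK = new LinkedHashMap<>();
    static {
        CROSSWALK.put("Title", "dc:title");
        CROSSWALK.put("Author", "dc:creator");
        CROSSWALK.put("Subject", "dc:description");
        CROSSWALK.put("Keywords", "pdf:Keywords");
        CROSSWALK.put("Creator", "xmp:CreatorTool");
        CROSSWALK.put("Producer", "pdf:Producer");
        CROSSWALK.put("CreationDate", "xmp:CreateDate");
        CROSSWALK.put("ModDate", "xmp:ModifyDate");
    }

    // Returns the Info keys whose XMP counterpart is missing or different.
    static List<String> mismatches(Map<String, String> info, Map<String, String> xmp) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : CROSSWALK.entrySet()) {
            String infoValue = info.get(entry.getKey());
            if (infoValue == null) {
                continue; // an absent Info entry imposes no requirement
            }
            if (!infoValue.equals(xmp.get(entry.getValue()))) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Every key this returns is an Info entry you would either delete from the document information dictionary or mirror in the XMP packet.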

