Automatically finding meaningful differences between two versions of a document is a hard problem that current tools handle poorly. Tools that compare documents word-by-word or line-by-line cannot tell whether a rewritten sentence still means the same thing, nor can they spot a subtle but important change hidden in an otherwise unchanged paragraph. This paper presents the Semantic Document Evolution Tracker (SDET), a four-step system that breaks documents into paragraphs, converts each paragraph into a compact numerical fingerprint of its meaning using the all-MiniLM-L6-v2 Sentence-BERT model, finds the closest-matching paragraph between the two versions by comparing those fingerprints, and then passes the potentially changed pairs to the LLaMA 3.3 70B language model to decide whether each one is Added, Modified, or Deleted and how serious the change is. Tests on 120 hand-labelled document pairs from corporate policy documents and software requirement specifications showed a correct classification rate of 91% and a correct severity rating of 87%, beating a standard keyword-matching approach by 19 percentage points. A step-by-step removal test confirmed that every part of the system adds measurable value. The full system is deployed as a web service using FastAPI, SQLite, and Streamlit.
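The matching stage described above (split into paragraphs, embed each one, pair paragraphs by cosine similarity, then flag candidates) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names, the two thresholds (`same` and `related`), and the bag-of-words `toy_embed` function are all assumptions made for this sketch. In SDET the embedding comes from the all-MiniLM-L6-v2 Sentence-BERT model, and the final Added/Modified/Deleted decision and severity rating are made by the LLaMA 3.3 70B model, which is omitted here.

```python
import math
import re
from collections import Counter


def split_paragraphs(text):
    """Split a document on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def toy_embed(paragraph):
    """Hypothetical stand-in for a sentence embedding: a word-count vector.

    A real pipeline would call a Sentence-BERT model here; the placeholder
    only keeps this sketch free of external dependencies.
    """
    return Counter(re.findall(r"[a-z0-9]+", paragraph.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def candidate_changes(old_text, new_text, same=0.99, related=0.5):
    """Pair each old paragraph with its closest new paragraph.

    The thresholds are assumed values for illustration only:
      similarity >= same      -> treated as unchanged, not reported
      related <= sim < same   -> candidate "Modified" pair
      sim < related           -> candidate "Deleted"
    New paragraphs that no old paragraph matched are candidate "Added".
    """
    old_paras = split_paragraphs(old_text)
    new_paras = split_paragraphs(new_text)
    new_vecs = [toy_embed(p) for p in new_paras]

    results = []
    matched_new = set()
    for old_para in old_paras:
        old_vec = toy_embed(old_para)
        best_j, best_sim = -1, 0.0
        for j, new_vec in enumerate(new_vecs):
            sim = cosine(old_vec, new_vec)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_sim < related:
            results.append(("Deleted", old_para, None))
        else:
            matched_new.add(best_j)
            if best_sim < same:
                results.append(("Modified", old_para, new_paras[best_j]))

    for j, new_para in enumerate(new_paras):
        if j not in matched_new:
            results.append(("Added", None, new_para))
    return results


if __name__ == "__main__":
    old_doc = (
        "Employees must submit expense reports within 30 days.\n\n"
        "Remote work requires manager approval."
    )
    new_doc = (
        "Employees must submit expense reports within 10 days.\n\n"
        "Remote work requires manager approval."
    )
    for label, before, after in candidate_changes(old_doc, new_doc):
        print(label, "|", before, "->", after)
```

A word-count vector cannot recognise a paraphrase, which is exactly the weakness the abstract attributes to keyword-based tools. Swapping `toy_embed` for a contextual embedding model is what would let the same matching loop treat a reworded paragraph as a close match.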
Semantic Change Detection, Contextual Embeddings, Sentence-BERT, FAISS, Llama, Document Versioning, Transformer Models, Cosine Similarity
IRE Journals:
Nikita Bachute, Prof. Mrs. Ashwini Garkhedkar "Semantic Change Detection in Evolving Documents Using Contextual Embeddings and Transformer Models" Iconic Research And Engineering Journals Volume 9 Issue 11 2026 Page 4659-4666 https://doi.org/10.64388/IREV9I11-1718361
IEEE:
Nikita Bachute, Prof. Mrs. Ashwini Garkhedkar
"Semantic Change Detection in Evolving Documents Using Contextual Embeddings and Transformer Models" Iconic Research And Engineering Journals, 9(11) https://doi.org/10.64388/IREV9I11-1718361