Spell-checking library (like Hunspell)?

I am porting an application for writers to the UWP platform. The only piece of the puzzle I have left is the NHunspell library. I use it extensively for my spell-checking and thesaurus features. I've customized the heck out of it and created custom dictionaries for various things (i.e. a different dictionary for each writing project). This library is a beautiful thing.

However, I can't seem to include this DLL in my UWP application.

1) Is there a way to force the use of this DLL? I really like how the NHunspell system is set up. It makes common sense and is very fast and easy to use.

2) If not, can anyone recommend a better solution for custom dictionaries, customized spell checking, etc.?

Update 3

After considerable digging and reading online, I found a link that discusses the theory of spell checking. Here is one quick example (the one I used the most):

http://www.anotherchris.net/csharp/how-to-write-a-spelling-corrector-in-csharp/

After reading this article, taking its base code, and stripping the English words out of the Hunspell .dic files, I have created my own spell-checking library that works in UWP.

Once I get it solidified, I will post it as an answer below to donate to the SO community. :)

Update 2

I'm giving up on using Hunspell. It doesn't look like it is possible at all... Are there any other libraries/packages that anyone can suggest?

Update:

I probably need to rephrase the statement that I can't include the DLL: I can't include the DLL through NuGet. It complains that the DLL is not compatible with the UAP/UWP platform.

I can manually include the DLL in my project by linking to an existing DLL (not NuGet). However, that DLL does indeed prove to be incompatible with the UAP platform. A simple call to spell check a word works fine in WinForms, but in the UWP app it crashes immediately with System.IO.FileNotFoundException.

The NHunspell constructor reaches out to load the associated .dic and .aff files. I have worked around this by loading the files into memory and then calling the alternate constructor, which takes a byte array instead of a file name for each of those files. It still crashes, but with a new Method not found error:

String System.AppDomain.get_RelativeSearchPath()

I am looking for any spell-checking engine that will work within the UAP framework. I would prefer it to be NHunspell, simply for familiarity reasons. However, I am not blind to the fact that this is becoming less and less viable as an option.

People I work with have suggested that I use the built-in spell-checking options. However, I can't use the built-in Windows 10/TextBox spell-checking features (that I know of), because I can't control custom dictionaries and I can't disable things like auto-capitalization and word replacement (where it replaces the word for you if it thinks it is close enough to the right guess). Those things are chapter-suicide for writers! A writer can turn them off at the OS level, but they may want them on for other apps, just not this one.

Please let me know if there is a workaround for NHunspell. And if you don't know of a workaround, please let me know your best replacement custom spell-checking engine that works within the UAP framework.

As a side note, I also use NHunspell for its thesaurus capability. It works very well in my Windows apps. I would have to replace this functionality as well, hopefully with the same engine as the spell-checking engine. However, if you know of a good thesaurus engine (one that doesn't check spelling), that is good too!

Thank you!!

2016-03-18 Jerry

Can you provide more information? Are you building from source? –

I did not build NHunspell from source. In previous releases (not UWP), I used NHunspell from the NuGet packages. With my UWP app, NuGet tells me that NHunspell is not compatible. – Jerry

NuGet is right; you need to build the library from source code using the UWP SDK. – yms

I downloaded the source code of the NHunspell library and tried to build a library with UWP support, but I ran into problems with the marshalling (Marshalling.cs).
The package loads DLLs that only work on the x86 and x64 architectures, so the app won't work on ARM (mobile, tablet).
The package loads the DLLs with system calls:

[DllImport("kernel32.dll")] 
    internal static extern IntPtr LoadLibrary(string fileName);

And I think it needs to be rewritten to work in UWP, because UWP uses sandboxing.

IMHO there are only two options:
1) Rewrite the Marshalling class to work within the restrictions of UWP.
2) Don't use Hunspell in your program.

I don't have much knowledge about DLLs in UWP, but I believe the rewrite could be very difficult.
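
For the first option, one possible starting point is to swap the LoadLibrary import for LoadPackagedLibrary, the Win32 function intended for loading DLLs that ship inside an app package. The sketch below is only an assumption about how such a rewrite might begin: the class name PackagedNativeLoader and the Load helper are hypothetical, and nothing here has been checked against the rest of NHunspell's Marshalling.cs. It also does not solve the architecture problem, since the native Hunspell DLL would still have to be compiled for every target, including ARM.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical replacement for the LoadLibrary import quoted above.
internal static class PackagedNativeLoader
{
    // LoadPackagedLibrary only loads DLLs from the app package graph;
    // its second argument is reserved and must be 0.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr LoadPackagedLibrary(string libraryName, uint reserved);

    // Hypothetical helper: throws if the packaged DLL cannot be loaded.
    internal static IntPtr Load(string libraryName)
    {
        IntPtr handle = LoadPackagedLibrary(libraryName, 0);
        if (handle == IntPtr.Zero)
        {
            throw new DllNotFoundException(
                "Could not load " + libraryName +
                " (Win32 error " + Marshal.GetLastWin32Error() + ")");
        }
        return handle;
    }
}
```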

2016-03-20 16:22:32 ganchito55

I am very grateful for your effort! Thanks for explaining. Great explanation. Do you know of any packages besides NHunspell for spelling and thesaurus? I have searched online and haven't found anything useful. (Lots of references to other packages like Aspell, but none of them will work in UWP.) – Jerry

Sorry, but I haven't found another library that can replace NHunspell. However, I think you could develop a web app with NHunspell and then create a web-hosted UWP app. Maybe that would solve the problem. – ganchito55

No problem. See my latest update. I really appreciate your input, though. – Jerry

As promised, here is the class I built to do my spell checking.

using System; 
using System.Collections.Generic; 
using System.IO; 
using System.Linq; 
using System.Text; 
using System.Text.RegularExpressions; 
using System.Threading.Tasks; 

namespace Com.HanelDev.HSpell 
{ 
    public class HSpellProcess 
    { 
     private Dictionary<string, string> _dictionary = new Dictionary<string, string>(); 

     public int MaxSuggestionResponses { get; set; } 

     public HSpellProcess() 
     { 
      MaxSuggestionResponses = 10; 
     } 

     public void AddToDictionary(string w) 
     { 
      if (!_dictionary.ContainsKey(w.ToLower())) 
      { 
       _dictionary.Add(w.ToLower(), w); 
      } 
      else 
      { 
       // Upper case words are more specific (but may be the first word 
       // in a sentence.) Lower case words are more generic. 
       // If you put an upper-case word in the dictionary, then for 
       // it to be "correct" it must match case. This is not true 
       // for lower-case words. 
       // We want to only replace existing words with their more 
       // generic versions, not the other way around. 
       if (_dictionary[w.ToLower()].CaseSensitive()) 
       { 
        _dictionary[w.ToLower()] = w; 
       } 
      } 
     } 

     public void LoadDictionary(byte[] dictionaryFile, bool resetDictionary = false) 
     { 
      if (resetDictionary) 
      { 
       _dictionary = new Dictionary<string, string>(); 
      } 
      using (MemoryStream ms = new MemoryStream(dictionaryFile)) 
      { 
       using (StreamReader sr = new StreamReader(ms)) 
       { 
        string tmp = sr.ReadToEnd(); 
        tmp = tmp.Replace("\r\n", "\r").Replace("\n", "\r"); 
        string [] fileData = tmp.Split("\r".ToCharArray()); 

        foreach (string line in fileData) 
        { 
         if (string.IsNullOrWhiteSpace(line) || line.StartsWith("#")) 
         { 
          continue; 
         } 

         string word = line; 

         // I added all of this for file imports (not array imports) 
         // to be able to handle words from Hunspell dictionaries. 
         // I don't get the hunspell derivatives, but at least I get 
         // the root word. 
         if (line.Contains("/")) 
         { 
          string[] arr = line.Split("/".ToCharArray()); 
          word = arr[0]; 
         } 

         AddToDictionary(word); 
        } 
       } 
      } 
     } 

     public void LoadDictionary(Stream dictionaryFileStream, bool resetDictionary = false) 
     { 
      string s = ""; 
      using (StreamReader sr = new StreamReader(dictionaryFileStream)) 
      { 
       s = sr.ReadToEnd(); 
      } 

      byte [] bytes = Encoding.UTF8.GetBytes(s); 

      LoadDictionary(bytes, resetDictionary); 
     } 

     public void LoadDictionary(List<string> words, bool resetDictionary = false) 
     { 
      if (resetDictionary) 
      { 
       _dictionary = new Dictionary<string, string>(); 
      } 

      foreach (string line in words) 
      { 
       if (string.IsNullOrWhiteSpace(line) || line.StartsWith("#")) 
       { 
        continue; 
       } 

       AddToDictionary(line); 
      } 
     } 

     public string ExportDictionary() 
     { 
      StringBuilder sb = new StringBuilder(); 

      foreach (string k in _dictionary.Keys) 
      { 
       sb.AppendLine(_dictionary[k]); 
      } 

      return sb.ToString(); 
     } 

     public HSpellCorrections Correct(string word) 
     { 
      HSpellCorrections ret = new HSpellCorrections(); 
      ret.Word = word; 

      if (_dictionary.ContainsKey(word.ToLower())) 
      { 
       string testWord = word; 
       string dictWord = _dictionary[word.ToLower()]; 
       if (!dictWord.CaseSensitive()) 
       { 
        testWord = testWord.ToLower(); 
        dictWord = dictWord.ToLower(); 
       } 

       if (testWord == dictWord) 
       { 
        ret.SpelledCorrectly = true; 
        return ret; 
       } 
      } 

      // At this point, we know the word is assumed to be spelled incorrectly. 
      // Go get word candidates. 
      ret.SpelledCorrectly = false; 

      Dictionary<string, HSpellWord> candidates = new Dictionary<string, HSpellWord>(); 

      List<string> edits = Edits(word); 

      GetCandidates(candidates, edits); 

      if (candidates.Count > 0) 
      { 
       return BuildCandidates(ret, candidates); 
      } 

      // If we didn't find any candidates by the main word, look for second-level candidates based on the original edits. 
      foreach (string item in edits) 
      { 
       List<string> round2Edits = Edits(item); 

       GetCandidates(candidates, round2Edits); 
      } 

      if (candidates.Count > 0) 
      { 
       return BuildCandidates(ret, candidates); 
      } 

      return ret; 
     } 

     private void GetCandidates(Dictionary<string, HSpellWord> candidates, List<string> edits) 
     { 
      foreach (string wordVariation in edits) 
      { 
       if (_dictionary.ContainsKey(wordVariation.ToLower()) && 
        !candidates.ContainsKey(wordVariation.ToLower())) 
       { 
        HSpellWord suggestion = new HSpellWord(_dictionary[wordVariation.ToLower()]); 

        suggestion.RelativeMatch = RelativeMatch.Compute(wordVariation, suggestion.Word); 

        candidates.Add(wordVariation.ToLower(), suggestion); 
       } 
      } 
     } 

     private HSpellCorrections BuildCandidates(HSpellCorrections ret, Dictionary<string, HSpellWord> candidates) 
     { 
      var suggestions = candidates.OrderByDescending(c => c.Value.RelativeMatch); 

      int x = 0; 

      ret.Suggestions.Clear(); 
      foreach (var suggest in suggestions) 
      { 
       x++; 
       ret.Suggestions.Add(suggest.Value.Word); 

       // only suggest the first X words. 
       if (x >= MaxSuggestionResponses) 
       { 
        break; 
       } 
      } 

      return ret; 
     } 

     private List<string> Edits(string word) 
     { 
      var splits = new List<Tuple<string, string>>(); 
      var transposes = new List<string>(); 
      var deletes = new List<string>(); 
      var replaces = new List<string>(); 
      var inserts = new List<string>(); 

      // Splits (i runs up to word.Length so that inserts at the end of the word are generated) 
      for (int i = 0; i <= word.Length; i++) 
      { 
       var tuple = new Tuple<string, string>(word.Substring(0, i), word.Substring(i)); 
       splits.Add(tuple); 
      } 

      // Deletes 
      for (int i = 0; i < splits.Count; i++) 
      { 
       string a = splits[i].Item1; 
       string b = splits[i].Item2; 
       if (!string.IsNullOrEmpty(b)) 
       { 
        deletes.Add(a + b.Substring(1)); 
       } 
      } 

      // Transposes 
      for (int i = 0; i < splits.Count; i++) 
      { 
       string a = splits[i].Item1; 
       string b = splits[i].Item2; 
       if (b.Length > 1) 
       { 
        transposes.Add(a + b[1] + b[0] + b.Substring(2)); 
       } 
      } 

      // Replaces 
      for (int i = 0; i < splits.Count; i++) 
      { 
       string a = splits[i].Item1; 
       string b = splits[i].Item2; 
       if (!string.IsNullOrEmpty(b)) 
       { 
        for (char c = 'a'; c <= 'z'; c++) 
        { 
         replaces.Add(a + c + b.Substring(1)); 
        } 
       } 
      } 

      // Inserts 
      for (int i = 0; i < splits.Count; i++) 
      { 
       string a = splits[i].Item1; 
       string b = splits[i].Item2; 
       for (char c = 'a'; c <= 'z'; c++) 
       { 
        inserts.Add(a + c + b); 
       } 
      } 

      return deletes.Union(transposes).Union(replaces).Union(inserts).ToList(); 
     } 

     public HSpellCorrections CorrectFrom(string txt, int idx) 
     { 
      if (idx >= txt.Length) 
      { 
       return null; 
      } 

      // Find the next incorrect word. 
      string substr = txt.Substring(idx); 
      int idx2 = idx; 

      List<string> str = substr.Split(StringExtensions.WordDelimiters).ToList(); 

      foreach (string word in str) 
      { 
       string tmpWord = word; 

       if (string.IsNullOrEmpty(word)) 
       { 
        idx2++; 
        continue; 
       } 

       // If we have a possessive version of a word, strip the 's off before testing 
       // the word. This solves issues like "My [mother's] favorite ring." 
       if (tmpWord.EndsWith("'s")) 
       { 
        tmpWord = word.Substring(0, tmpWord.Length - 2); 
       } 

       // Skip things like ***, #HashTagsThatMakeNoSense and 1,2345.67 
       if (!tmpWord.IsWord()) 
       { 
        idx2 += word.Length + 1; 
        continue; 
       } 

       HSpellCorrections cor = Correct(tmpWord); 

       if (cor.SpelledCorrectly) 
       { 
        idx2 += word.Length + 1; 
       } 
       else 
       { 
        cor.Index = idx2; 
        return cor; 
       } 
      } 

      return null; 
     } 
    } 
}
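
The class above relies on a few helper types that are not shown in the post: HSpellCorrections, HSpellWord, RelativeMatch, and the string extensions CaseSensitive, IsWord and WordDelimiters. The following is a minimal sketch of what they might look like so that the class compiles. Everything in it is an assumption inferred from how the class uses these names, not the author's actual code; in particular the scoring in RelativeMatch.Compute and the delimiter list are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Com.HanelDev.HSpell
{
    // Hypothetical result object returned by Correct() and CorrectFrom().
    public class HSpellCorrections
    {
        public string Word { get; set; }
        public bool SpelledCorrectly { get; set; }
        public int Index { get; set; }
        public List<string> Suggestions { get; } = new List<string>();
    }

    // Hypothetical suggestion candidate together with its ranking score.
    public class HSpellWord
    {
        public HSpellWord(string word) { Word = word; }
        public string Word { get; }
        public double RelativeMatch { get; set; }
    }

    public static class RelativeMatch
    {
        // Assumed scoring: share of character positions that agree, ignoring
        // case. 1.0 means identical, 0.0 means nothing lines up.
        public static double Compute(string a, string b)
        {
            int longest = Math.Max(a.Length, b.Length);
            if (longest == 0) return 1.0;
            int shortest = Math.Min(a.Length, b.Length);
            int same = 0;
            for (int i = 0; i < shortest; i++)
            {
                if (char.ToLowerInvariant(a[i]) == char.ToLowerInvariant(b[i])) same++;
            }
            return (double)same / longest;
        }
    }

    public static class StringExtensions
    {
        // Assumed delimiter set; the apostrophe is left out on purpose so that
        // possessives such as "mother's" reach CorrectFrom() in one piece.
        public static readonly char[] WordDelimiters =
            { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '"', '(', ')' };

        // Assumed rule: an entry containing an upper-case letter must match case.
        public static bool CaseSensitive(this string s)
        {
            return s.Any(c => char.IsUpper(c));
        }

        // Assumed rule: a word consists only of letters, apostrophes and hyphens.
        public static bool IsWord(this string s)
        {
            return s.Length > 0 && s.All(c => char.IsLetter(c) || c == '\'' || c == '-');
        }
    }
}
```

With these in place, creating an HSpellProcess, calling LoadDictionary(new List<string> { "apple" }) and then Correct("aple") should report the word as misspelled and offer "apple" among the suggestions.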

2016-03-25 14:17:23 Jerry

You can use the built-in Windows spell checker directly, so that you can better control its behaviour, and then apply the results to the TextBox control yourself.

Take a look at ISpellChecker. It lets you add your own custom dictionary and has many more options for controlling its behaviour. And yes, it is available for UWP.

2016-05-09 08:21:39 Stefan

Interesting idea. I may have looked at it before, but I have two major concerns with it. 1) Microsoft went out of their way to mark that interface as "do not use" in their documentation. No big deal, but it worries me, and 2) most importantly, if I did build it, I haven't found a specific way to tell my app to use *MY* spell-checking class instead of the built-in spell-checking class. – Jerry

The documentation says "do not implement", not "do not use". That is a big, big difference! – Stefan

Not sure why you would need your own spell-checking "class"; with this you don't need one. https://msdn.microsoft.com/en-us/library/windows/desktop/hh869748%28v=vs.85%29.aspx and here is a sample: https://code.msdn.microsoft.com/windowsdesktop/Spell-checking-client-aea0148c – Stefan
