CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part-of-speech tagging: three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then in addition to NLTK (https://www.nltk.org/), you will also need to install the conllu package (https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested,
you can download the entire set of treebanks from https://universaldependencies.org/.
Parameter estimation
First, we write code to estimate the transition probabilities and the emission probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from a training corpus from Universal Dependencies. Do not forget to involve the start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
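The two counting steps and the smoothing step can be sketched as follows. The toy corpus and the hand-rolled Witten-Bell estimator are illustrative assumptions only; in the practical itself the sentences come from a treebank read with conllu, and smoothing can be delegated to NLTK's WittenBellProbDist as in smoothing.py:

```python
from collections import Counter

# Toy tagged corpus; in the practical these sentences would come from a
# Universal Dependencies treebank read with the conllu package.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]

transitions = Counter()  # counts of (previous tag, tag) pairs
emissions = Counter()    # counts of (tag, word) pairs

for sentence in corpus:
    prev = "<s>"                      # start-of-sentence marker
    for word, tag in sentence:
        transitions[(prev, tag)] += 1
        emissions[(tag, word)] += 1
        prev = tag
    transitions[(prev, "</s>")] += 1  # end-of-sentence marker

# Hand-rolled Witten-Bell estimate, for illustration only; the practical
# can use NLTK's WittenBellProbDist instead.  With N observed tokens and
# T distinct types, each seen event gets count/(N + T), and the reserved
# mass T/(N + T) is shared equally among the Z unseen bins.
def witten_bell(counts, bins):
    n = sum(counts.values())
    t = len(counts)
    z = bins - t
    def prob(event):
        if event in counts:
            return counts[event] / (n + t)
        return t / (z * (n + t))
    return prob
```

With 5 bins, for example, `witten_bell(Counter({"a": 2, "b": 1}), 5)` reserves 2/5 of the probability mass for the three unseen events.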
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine, for each i = 1, …, n, in this order:

t̂_i = argmax_{t_i} P(t_i | t̂_{i−1}) · P(w_i | t_i)
assuming tˆ0 is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker
⟨/s⟩ is not even used here.
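As a sketch, assuming a two-tag model with hypothetical probability tables (LT for transitions, LE for emissions) standing in for the smoothed estimates:

```python
import math

# Hypothetical smoothed probabilities for a two-tag toy model.
LT = {("<s>", "DET"): 0.9, ("<s>", "NOUN"): 0.1,
      ("DET", "DET"): 0.2, ("DET", "NOUN"): 0.8,
      ("NOUN", "DET"): 0.7, ("NOUN", "NOUN"): 0.3}
LE = {("DET", "the"): 0.9, ("DET", "dog"): 0.1,
      ("NOUN", "the"): 0.2, ("NOUN", "dog"): 0.8}

def eager_tag(words, tags):
    # Choose each tag greedily from the previously chosen tag and the
    # current word; no lookahead, and no use of the </s> marker.
    chosen, prev = [], "<s>"
    for w in words:
        prev = max(tags, key=lambda t: math.log(LT[(prev, t)])
                                       + math.log(LE[(t, w)]))
        chosen.append(prev)
    return chosen
```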
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
t̂_1 ⋯ t̂_n = argmax_{t_1 ⋯ t_n} ( ∏_{i=1}^{n} P(t_i | t_{i−1}) · P(w_i | t_i) ) · P(t_{n+1} | t_n)

where the tokens of the input sentence are w_1 ⋯ w_n, and t_0 = ⟨s⟩ and t_{n+1} = ⟨/s⟩ are the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
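The recurrence in log space can be sketched as follows; the probability tables are hypothetical toy values, not estimates from a treebank:

```python
import math

# Hypothetical smoothed probabilities for a two-tag toy model.
LT = {("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
      ("DET", "DET"): 0.1, ("DET", "NOUN"): 0.8, ("DET", "</s>"): 0.1,
      ("NOUN", "DET"): 0.3, ("NOUN", "NOUN"): 0.3, ("NOUN", "</s>"): 0.4}
LE = {("DET", "the"): 0.7, ("DET", "dog"): 0.3,
      ("NOUN", "the"): 0.1, ("NOUN", "dog"): 0.9}

def lt(u, t):  # log transition probability
    return math.log(LT[(u, t)])

def le(t, w):  # log emission probability
    return math.log(LE[(t, w)])

def viterbi(words, tags):
    # best[i][t]: log probability of the best tag sequence for
    # words[0..i] ending in tag t; back[i][t]: its predecessor tag.
    best = [{t: lt("<s>", t) + le(t, words[0]) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        col, ptr = {}, {}
        for t in tags:
            score, arg = max((best[i - 1][u] + lt(u, t), u) for u in tags)
            col[t] = score + le(t, words[i])
            ptr[t] = arg
        best.append(col)
        back.append(ptr)
    # Close the sequence with the transition to the end marker.
    score, last = max((best[-1][t] + lt(t, "</s>"), t) for t in tags)
    seq = [last]
    for i in range(len(words) - 1, 0, -1):
        seq.append(back[i][seq[-1]])
    return list(reversed(seq))
```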
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token individually. That is, for each i, we compute:

t̂_i = argmax_{t_i} Σ_{t_1 ⋯ t_{i−1} t_{i+1} ⋯ t_n} ( ∏_{k=1}^{n} P(t_k | t_{k−1}) · P(w_k | t_k) ) · P(t_{n+1} | t_n)
To compute this efficiently, we need to use forward and backward values, as discussed in the lectures on the Baum-Welch algorithm, making use of the fact that the above is equivalent to:
t̂_i = argmax_{t_i} ( Σ_{t_1 ⋯ t_{i−1}} ∏_{k=1}^{i} P(t_k | t_{k−1}) · P(w_k | t_k) ) · ( Σ_{t_{i+1} ⋯ t_n} ∏_{k=i+1}^{n} P(t_k | t_{k−1}) · P(w_k | t_k) ) · P(t_{n+1} | t_n)
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
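Combining the two ideas (replace Viterbi's maximisation by summation, done stably in log space), a forward pass might look like this; the probability tables are again hypothetical toy values:

```python
import math

def logsumexp(logs):
    # log(exp(a) + exp(b) + ...) computed stably by factoring out the
    # maximum, so that no individual exp() underflows to zero.
    m = max(logs)
    return m + math.log(sum(math.exp(x - m) for x in logs))

# Hypothetical smoothed probabilities for a two-tag toy model.
LT = {("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
      ("DET", "DET"): 0.1, ("DET", "NOUN"): 0.8, ("DET", "</s>"): 0.1,
      ("NOUN", "DET"): 0.3, ("NOUN", "NOUN"): 0.3, ("NOUN", "</s>"): 0.4}
LE = {("DET", "the"): 0.7, ("DET", "dog"): 0.3,
      ("NOUN", "the"): 0.1, ("NOUN", "dog"): 0.9}

def lt(u, t):
    return math.log(LT[(u, t)])

def le(t, w):
    return math.log(LE[(t, w)])

def forward(words, tags):
    # Identical in shape to Viterbi, with max replaced by logsumexp:
    # f[i][t] is the log of the summed probability of all tag prefixes
    # for words[0..i] that end in tag t.
    f = [{t: lt("<s>", t) + le(t, words[0]) for t in tags}]
    for i in range(1, len(words)):
        prev = f[-1]
        f.append({t: logsumexp([prev[u] + lt(u, t) for u in tags])
                     + le(t, words[i]) for t in tags})
    return f

def sentence_log_prob(words, tags):
    # Summing out the final tag (with the </s> transition) gives the
    # log probability of the whole sentence.
    f = forward(words, tags)
    return logsumexp([f[-1][t] + lt(t, "</s>") for t in tags])
```

The backward pass has the same shape, run right to left from the ⟨/s⟩ transition.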
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
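A per-token accuracy function might look like this; `tagger` stands for any of the three algorithms, and the gold sentences use the same (word, tag) format as in parameter estimation:

```python
def accuracy(gold_sentences, tagger):
    """Percentage of tokens whose predicted tag equals the gold tag.

    gold_sentences: list of sentences, each a list of (word, tag) pairs;
    tagger: a function from a list of words to a list of predicted tags
    (any of the three algorithms above)."""
    correct = total = 0
    for sentence in gold_sentences:
        words = [w for w, _ in sentence]
        predicted = tagger(words)
        for (_, gold_tag), tag in zip(sentence, predicted):
            correct += tag == gold_tag
            total += 1
    return 100.0 * correct / total
```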
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but
please do not include the entire set of hundreds of treebanks from Universal
Dependencies, because this would be a huge waste of disk space and bandwidth for the marker.
Marking is in line with the General Mark Descriptors (see pointers below). Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but
nonetheless demonstrates some understanding of POS tagging. Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1. Evidence
of a competent attempt addressing most requirements (up to 13 marks) could be fully
correct code in good style, implementing Algorithms 1 and 2 and a brief report. Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be
a good implementation of Algorithms 1 and 2, plus an informative report discussing
meaningful experiments. Evidence of an excellent attempt with no significant defects
(up to 18 marks) requires an excellent implementation of all three algorithms, and a
report that discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures. An
exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis and reflection in
the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. Especially understanding of language and how
language works from the perspective of the HMM model is what this practical
should be about.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it, and mistake this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume a version
of Python3 that is the one on the lab machines or older; the marker may not
have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.
請(qǐng)加QQ:99515681  郵箱:99515681@qq.com   WX:codehelp 

標(biāo)簽:

掃一掃在手機(jī)打開當(dāng)前頁
  • 上一篇:代做CS252編程、代寫C++設(shè)計(jì)程序
  • 下一篇:AcF633代做、Python設(shè)計(jì)編程代寫
  • 無相關(guān)信息
    昆明生活資訊

    昆明圖文信息
    蝴蝶泉(4A)-大理旅游
    蝴蝶泉(4A)-大理旅游
    油炸竹蟲
    油炸竹蟲
    酸筍煮魚(雞)
    酸筍煮魚(雞)
    竹筒飯
    竹筒飯
    香茅草烤魚
    香茅草烤魚
    檸檬烤魚
    檸檬烤魚
    昆明西山國(guó)家級(jí)風(fēng)景名勝區(qū)
    昆明西山國(guó)家級(jí)風(fēng)景名勝區(qū)
    昆明旅游索道攻略
    昆明旅游索道攻略
  • 短信驗(yàn)證碼平臺(tái) 理財(cái) WPS下載

    關(guān)于我們 | 打賞支持 | 廣告服務(wù) | 聯(lián)系我們 | 網(wǎng)站地圖 | 免責(zé)聲明 | 幫助中心 | 友情鏈接 |

    Copyright © 2025 kmw.cc Inc. All Rights Reserved. 昆明網(wǎng) 版權(quán)所有
    ICP備06013414號(hào)-3 公安備 42010502001045

    亚洲人成伊人成综合网小说| 日韩毛片高清在线播放| jizz在线观看视频| 国产精品家庭影院| 亚洲欧洲日本mm| 免费黄网站欧美| 国产精品99久久久久久久vr| 91偷拍一区二区三区精品| 亚洲免费看片| 黄页网站在线观看| 欧美高清在线一区| 最新欧美精品一区二区三区| jlzzjlzz亚洲日本少妇| 成人一区二区三区| 蜜桃视频一区二区| 天天操综合网| 国产高清视频一区二区| 国产系列电影在线播放网址| 97影院理论| 精品成人在线观看| 欧美性少妇18aaaa视频| 亚洲精品国产精品粉嫩| 深夜福利视频一区| 一区二区三区国产| 日韩高清欧美激情| 国产精品一区二区中文字幕| 99久久99久久精品国产片果冰| 美日韩中文字幕| 精品123区| 欧美日韩大片免费观看| 精品国精品国产自在久国产应用 | av一区二区三区在线| 精品国产乱码久久| 国产美女福利在线观看| 精品成av人一区二区三区| 日本欧美一区二区在线观看| 久久成人在线| 麻豆精品久久精品色综合| 成人免费va视频| 国产在线精品国自产拍免费| 成人久久18免费网站麻豆| 国内外成人在线| 美女精品视频在线| 国产精品蜜臀| 校园春色欧美| 精品日产卡一卡二卡麻豆| 亚洲一区二区精品3399| 日韩精品一区二| 先锋av资源在线| 污导航在线观看| 日韩伦理在线| 国产精品啊v在线| 亚洲欧美自拍偷拍| 国产精品一卡二卡三卡| 韩国av一区二区三区在线观看| 欧美日韩国产一级| 成人资源在线| 久久精品视频一区二区三区| 欧美日韩国产精品一区二区不卡中文| 制服黑丝国产在线| 午夜在线免费观看视频| 欧美v亚洲v综合v国产v仙踪林| 激情国产一区| 日韩欧美在线中文字幕| 初尝黑人巨炮波多野结衣电影| 欧美日韩在线精品一区二区三区激情| 亚洲男人的天堂在线aⅴ视频| 国产精品蜜臀在线观看| 亚洲精品免费一二三区| 精品久久久久久久久国产字幕 | 久久老女人爱爱| 亚洲va天堂va国产va久| 欧美三级免费观看| 在线看片福利| 国产精品扒开腿做爽爽爽软件| 亚洲国产成人精品视频| 日本成人精品| 欧美色电影在线| 国产伦精品一区二区三区视频 | 亚洲靠逼com| 国产一级性片| 久久免费电影| 在线亚洲观看| 午夜在线成人av| 新版的欧美在线视频| 91精品国产麻豆国产在线观看| 日韩欧美一区二区三区在线视频| 一区二区三区四区五区精品视频 | 精品久久中文字幕| 肉色欧美久久久久久久免费看| 美女久久精品| 国产午夜精品一区二区| h动漫在线视频| 精品系列免费在线观看| 性欧美超级视频| 91九色最新地址| 久久99精品久久久久久| av成人亚洲| 婷婷综合久久一区二区三区| 羞羞视频在线观看免费| 国模无码大尺度一区二区三区| 日韩一卡二卡三卡四卡| 高清不卡亚洲| 高清不卡一二三区| 色www永久免费视频首页在线| 久久这里只有精品视频网| 亚洲第一福利一区| 91热爆在线观看| crdy在线观看欧美| 日韩高清不卡在线| 精品久久久精品| 国产鲁鲁视频在线观看特色| 黄色成人精品网站| 超碰在线一区二区三区| 亚洲国产高清一区二区三区| 国产精品三级在线观看| 电影中文字幕一区二区| 欧美精品一二三| 国产精品videosex性欧美| 成人高清免费在线播放| 亚洲一区二区偷拍精品| 日韩午夜精品| av在线最新| 91精品福利在线| 久久精品主播| 黄黄的网站在线观看| 久久久91精品国产一区二区精品| 亚洲女人天堂在线| 日本在线高清| 亚洲靠逼com| 国产精品va视频| 欧美日免费三级在线| 精品国产欧美日韩一区二区三区| 国产高清精品在线| 日本在线观看视频| 国产精品成人午夜| 久久成人一区| 精品久久国产一区| 日本精品一级二级| 免费人成网站在线观看欧美高清| 日韩精品影片| 麻豆精品视频在线| 欧美tk丨vk视频| 精品电影一区| a国产在线视频| 在线精品亚洲一区二区不卡| 久久国产生活片100| 日韩理论在线| 国语一区二区三区| av中文资源在线资源免费观看| 羞羞视频在线免费看| 欧美日免费三级在线| 综合在线观看色| 免播放器亚洲| 欧美日韩免费观看视频| 天天做夜夜操| 中文字幕一区二区三三 | 色久视频在线观看| 久久久91精品国产一区二区三区| 福利欧美精品在线| 欧美蜜桃一区二区三区| 亚洲视频二区| 中文字幕有码在线观看| 黄色精品在线看| 亚洲一区二区三区在线| 成av人片一区二区| 蜜桃在线一区| h网站在线免费观看| 亚洲美女视频在线| 欧美女人交a| 成人做爰免费视频免费看| 成人av资源在线| 国产成人福利夜色影视| 在线a人片免费观看视频| 白天操夜夜操| 
亚洲欧美日韩中文字幕一区二区三区 | 亚洲国产高清视频| 奇米四色…亚洲| 成人av电影在线网| 国产风韵犹存在线视精品| 国产精品综合av一区二区国产馆| 精品美女久久久| 国产偷倩在线播放| 可以在线观看的黄色| 久久久亚洲午夜电影| 美日韩一区二区| 欧美wwwww| 天天免费亚洲黑人免费| a天堂中文在线| 91精品免费在线| 久久精品视频免费| 亚洲区一区二| 亚洲激情播播| 欧美成人毛片| av在线日韩| av免费不卡| 中文字幕在线播放第一页| 亚洲成人在线免费| 国产精品久久久久久久午夜片| 国产精品久久久久久久岛一牛影视 |