ASSESSMENT 1 BRIEF

Subject Code and Title: DLE602 Deep Learning
Assessment: Programming Problems
Individual/Group: Individual
Length: 500 words (+/–10%) and Source Code
Learning Outcomes: The Subject Learning Outcomes demonstrated by the successful completion of the task below include:
a) Build, train and apply deep learning models to real-world tasks.
b) Compare and select ways to pre-process signals, images, and texts for natural language, speech recognition, and computer vision applications.
Submission: Due by 11:55 pm AEST/AEDT, Sunday, end of Module 4.
Weighting: 30%
Total Marks: 100 marks

Task Summary
For this assessment, you will undertake a Twitter sentiment analysis using an N-Gram model, as described in the article entitled ‘Deep Convolution Neural Networks for Twitter Sentiment Analysis’ by Zhao, Gui and Zhang (2018). You can access this article at: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8244338.
Use any two of the five datasets used in this paper and implement Twitter sentiment analysis using the Python programming language. Identify and report on the similarities or dissimilarities of the outcomes for the two different sources.
Please refer to the Task Instructions (below) for further details on how to complete this task.
Context
Twitter Sentiment Analysis is an automated process whereby text data from Twitter is analysed and segmented into different sentiments (e.g., positive, negative or neutral sentiments). Performing a sentiment analysis on data from Twitter using deep learning models can help organisations understand how people are talking about their brand.
In the above-mentioned paper, Zhao, Gui and Zhang (2018) introduced the concept of using Deep Convolution Neural Networks for Twitter Sentiment Analysis. The authors also briefly described how the N-Gram model applies to the process. They conclude that Deep Convolution Neural Networks, which use pre-trained word vectors, can perform the task of Twitter sentiment analysis well. They used five different datasets to prove their point.
You will focus on the development of a basic Twitter Sentiment Analysis system using the N-Gram probabilistic language model. You will demonstrate your understanding of language processing models and your ability to develop systems using those models. You will also demonstrate your communication skills by drafting a short report.
Task Instructions
To complete this assessment task, you will need to read the article entitled ‘Deep Convolution Neural Networks for Twitter Sentiment Analysis’ (Zhao, Gui & Zhang, 2018) closely.
You are NOT expected to reproduce all the experiments completed in this paper. This paper is provided as a reference to enable you to better understand the context of this assessment and provide you with an idea of the quality of research papers that you need to read as part of Assessments 2 and 3.
The only task you are required to complete in this assessment is to develop a Twitter sentiment analysis technique that uses an N-Gram probabilistic language model.
Your aim is to be able to analyse any Twitter text and classify it into one of several sentiments, such as positive, negative or neutral.
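As background, the sketch below shows one common way to collect bigram counts from tokenised tweets and turn them into smoothed probabilities. It is a minimal illustration only: the function names, the add-one smoothing and the assumption that tweets arrive as lists of tokens are choices made for this example rather than requirements of the brief.

# Illustrative sketch only (not prescribed by the brief): estimating bigram
# probabilities from tokenised tweets with add-one (Laplace) smoothing.
from collections import Counter

def train_bigram_counts(tokenised_tweets):
    """Collect unigram and bigram counts from an iterable of token lists."""
    unigram_counts = Counter()
    bigram_counts = Counter()
    for tokens in tokenised_tweets:
        unigram_counts.update(tokens)
        bigram_counts.update(zip(tokens, tokens[1:]))
    return unigram_counts, bigram_counts

def bigram_probability(previous_word, word, unigram_counts, bigram_counts):
    """Add-one smoothed estimate of P(word | previous_word)."""
    vocabulary_size = len(unigram_counts)
    return (bigram_counts[(previous_word, word)] + 1) / (
        unigram_counts[previous_word] + vocabulary_size
    )

One possible design, for example, is to train one set of counts on positive tweets and another on negative tweets, so that each word of a new tweet can be scored under both models; this is only one option among several.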
If your N-Gram model (Bigram or Trigram) identifies at least one fourth of the words in a Twitter text as positive, classify that text as positive. If it identifies at least one fourth of the words as negative, classify that text as negative. In any other case, classify the Twitter text as neutral. Use the same N-Gram order for both polarities: if you use a Bigram model for positive Twitter texts, use a Bigram model for negative texts as well; similarly, if you use a Trigram model, use it for both.
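To make this decision rule concrete, here is a minimal sketch of how it could be applied once your Bigram or Trigram model has assigned a sentiment label to each word of a tweet. The per-word labels, the function name and the tie handling (a tweet that reaches the threshold for both polarities is treated as neutral) are assumptions made for illustration, not part of the brief.

# Illustrative sketch only: one possible implementation of the one-fourth
# decision rule described above. The input is a list of per-word sentiment
# labels ("positive", "negative" or "neutral") produced by your own N-Gram model.

def classify_tweet(word_sentiments):
    """Classify a tweet as positive, negative or neutral from per-word labels."""
    total_words = len(word_sentiments)
    if total_words == 0:
        return "neutral"
    positive_words = word_sentiments.count("positive")
    negative_words = word_sentiments.count("negative")
    threshold = total_words / 4  # "one fourth of the words"
    if positive_words >= threshold and negative_words < threshold:
        return "positive"
    if negative_words >= threshold and positive_words < threshold:
        return "negative"
    return "neutral"

# Example with hypothetical labels for a four-word tweet:
# classify_tweet(["positive", "neutral", "neutral", "neutral"])  ->  "positive"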
The authors used five different datasets to prove their points. You need to use two of the five datasets to implement your solution. You do NOT have to use all five datasets.
Use Python as the programming language for this natural language processing assessment. The code must be well formatted and conform to Python naming conventions. You also need to provide sufficient comments in the code.
You are also required to prepare a 500-word report highlighting the similarities or dissimilarities of the outcomes from the two different sources. You can choose to divide the word limit into multiple paragraphs. Include a short introduction with any critical points that will help your readers understand the outcomes of your program. Then, briefly describe whether you see similar or different trends, in terms of positive, negative and neutral Twitter sentiments, across your two datasets. Discuss whether your program behaved in the same way for the different datasets.
You will be assessed on the completeness of your model, the efficiency of the implementation, your coding conventions, the quality of your code and your articulation of the outcomes.
Finally, you will submit the source code. You must provide a link to the dataset used. Ensure that you include instructions on how to run your code at the top of your main source code file inside a comment block.
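For example, the comment block at the top of your main source code file could take a form like the following; the file names, commands and folder layout shown here are placeholders rather than prescribed requirements.

"""
DLE602 Assessment 1 -- Twitter Sentiment Analysis using an N-Gram model.

How to run (example only; adjust the names to match your own files):
    1. Install the dependencies:   pip install -r requirements.txt
    2. Place the two dataset files in the data/ folder
       (the download link for each dataset is provided in the report).
    3. Run the program:            python sentiment_analysis.py
"""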
Referencing
It is essential that you use APA style to cite and reference your research. For more information on referencing, visit our Library website at: https://library.torrens.edu.au/academicskills/apa/tool.
Submission Instructions
Submit the source code, the link to the dataset you used and your 500-word report in a single zip file via the Assessment 1 link in the main navigation menu for ‘DLE602: Assessment 1’ by 11:55 pm AEST/AEDT on the Sunday at the end of Module 4 (Week 4).
The Learning Facilitator will provide feedback via the Grade Centre in the LMS portal. Feedback can be viewed in My Grades.
References
Zhao, J., Gui, X. & Zhang, X. (2018). Deep convolution neural networks for Twitter sentiment analysis. IEEE Access, 6, 23253–23260. Retrieved from https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8244338
Jurafsky, D. & Martin, J. H. (2008). Speech and language processing. Boston, MA: Pearson. Retrieved from https://web.stanford.edu/~jurafsky/slp3/3.pdf
Academic Integrity Declaration
I declare that except where I have referenced, the work I am submitting for this assessment task is my own work. I have read and am aware of Torrens University Australia Academic Integrity Policy and Procedure viewable online at http://www.torrens.edu.au/policies-and-forms
I am aware that I need to keep a copy of all submitted material and drafts, and I will do so accordingly.
Assessment Rubric

Grade bands: Fail (Yet to achieve minimum standard) 0–49%; Pass (Functional) 50–64%; Credit (Proficient) 65–74%; Distinction (Advanced) 75–84%; High Distinction (Exceptional) 85–100%.

Completeness and efficiency (25% of the final grade)
• The implementation covers all requirements.
• The whole system is easy to use and run.
Fail: None of the requirements have been implemented. The system does not function properly or is extremely buggy. An extreme level of manual configuration is required to run the system and, even then, the configuration does not work.
Pass: One or two major requirements have been implemented. The system does not function properly. No exception handling has been implemented. Users are required to follow a lengthy configuration manual to run the system.
Credit: All but one or two major requirements have been implemented. The system functions only if certain additional conditions are met. Basic exception handling has been implemented, but it is not thorough. Users are required to follow a short configuration manual to run the system.
Distinction: Most of the major requirements have been implemented. The system functions without any additional conditions needing to be met. Basic exception handling has been implemented, but it is not thorough. Users are only required to copy the necessary data to the right locations.
High Distinction: All of the major requirements have been implemented. The system functions properly and exceptions are handled very well. Users can run the system without any configuration.

Coding convention and quality of code (25% of the final grade)
• The code follows a consistent and well-formatted programming convention.
• The code contains sufficient comments.
Fail: The code is not formatted. The naming of the methods or variables is inconsistent and no naming convention is followed. Little or no comments are provided.
Pass: There are significant errors in the format of the code and the naming of the methods or variables. There is a significant lack of useful comments.
Credit: The code is generally well written, but there is some room for improvement. There are more than five but fewer than eight errors in terms of the naming conventions and the format of the code. There is a reasonable amount of useful comments.
Distinction: The code is generally well written. There are more than two but fewer than five errors in terms of the naming conventions and the format of the code. There is a sufficient amount of useful comments.
High Distinction: The code is expertly written. There are no more than two errors in terms of the naming conventions and the format of the code. There is a sufficient amount of useful comments.

Accuracy of outcomes (25% of the final grade)
• The code produces the correct results.
• The code behaves the same way, independent of the dataset.
Fail: The code cannot classify any Twitter texts into sentiments, such as positive, negative or neutral sentiments, for either dataset.
Pass: The code can only classify very selective Twitter texts correctly into sentiments, such as positive, negative or neutral sentiments. The code demonstrates this behaviour for both datasets.
Credit: The code can reasonably classify the Twitter texts correctly into sentiments, such as positive, negative or neutral sentiments, but demonstrates this correct behaviour for one dataset and not for the other.
Distinction: The code can classify 70% of the Twitter texts correctly into sentiments, such as positive, negative or neutral sentiments, and demonstrates this correct behaviour for both datasets.
High Distinction: The code can classify 85% of the Twitter texts correctly into sentiments, such as positive, negative or neutral sentiments, and demonstrates this correct behaviour for both datasets.

Effective written communication (25% of the final grade)
• Writing skills.
• Highlighting the similarities or dissimilarities of the outcomes from two different data sources.
Fail: Poor writing skills; the articulation is not clear at all. Lacks overall organisation and is very difficult to follow. Grammar and spelling errors make it difficult for the reader to interpret the text in many places. Failed to highlight the similarities or dissimilarities of the outcomes from the two different data sources.
Pass: Writing is readable; however, it is difficult to comprehend the information presented. Not well organised for the most part and difficult to follow. The choice of words needs to be improved, and grammatical errors impede the flow of communication. Attempted to highlight the similarities or dissimilarities of the outcomes from the two different data sources; however, the discussion is not thought provoking or insightful.
Credit: Writing is readable and it is reasonably easy to comprehend the information presented. Organised for the most part, though still difficult to follow in places. Words are well chosen; however, some minor improvements are needed. Sentences are mostly grammatically correct and contain few spelling errors. Attempted to highlight the similarities or dissimilarities of the outcomes from the two different data sources, and the discussion is reasonably thought provoking or insightful.
Distinction: Writing is good and it is easy to comprehend the information presented. Well organised, cohesive and easy to follow. Words are well chosen. Sentences are grammatically correct and free of spelling errors. Attempted to highlight the similarities or dissimilarities of the outcomes from the two different data sources, and the discussion is thought provoking or insightful.
High Distinction: Writing is excellent: short, sharp, to the point and easily digestible. Exceptionally organised, highly cohesive and easy to follow. Words are carefully chosen to precisely express the intended meaning and support reader comprehension. Sentences are grammatically correct and free of spelling errors. Highlighted the similarities or dissimilarities of the outcomes from the two different data sources in a way that is thought provoking and insightful and uncovers something unique from the experiments.
The following Subject Learning Outcomes are addressed in this assessment:
SLO a) Build, train and apply deep learning models to real-world tasks.
SLO b) Compare and select ways to pre-process signals, images, and texts for natural language, speech recognition, and computer vision applications.

