This repository was archived by the owner on May 25, 2022. It is now read-only.

Commit 14a370d

added files
1 parent 7c9b2b0 commit 14a370d

4 files changed

+115
-0
lines changed

@@ -0,0 +1,5 @@
# Scraping Medium Articles

[Medium](https://medium.com/) is a website that hosts articles on many topics and is widely used by programmers.

This script asks the user for the URL of a Medium article, scrapes its text, and saves it to a text file in the same directory as the script.
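
As a rough illustration of the text-cleaning step, the sketch below converts `<br>`, `<br/>` and `<li>` tags to newlines and then drops every remaining tag, which is the same idea the script's `purify` function follows. The name `strip_tags` and the sample HTML are hypothetical and used only for this example; it relies on the standard library alone.

```python
import re


# Hypothetical helper for illustration; the script's own function is `purify`.
def strip_tags(html: str) -> str:
    """Turn <br>, <br/> and <li> into newlines, then drop every other tag."""
    html = re.sub(r"<br/?>|<li>", "\n", html)
    # Non-greedy match so each tag is removed on its own.
    return re.sub(r"<.*?>", "", html)


# Made-up sample input, not taken from a real article page.
sample = "<p>Open source<br/>software</p><ul><li>free</li><li>reliable</li></ul>"
print(strip_tags(sample))
```

A regular expression is enough here because the goal is only to discard markup from fragments that have already been parsed; it is not a general-purpose HTML sanitizer.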
@@ -0,0 +1,2 @@
beautifulsoup4==4.9.1
requests==2.23.0
@@ -0,0 +1,42 @@
url: https://medium.com/4thought-studios/the-pros-and-cons-of-open-source-software-d498304f2a95

Title: THE PROS AND CONS OF OPEN SOURCE SOFTWARE
by Khalil Khalaf

INTRODUCTION

The Pros and Cons of Open Source Software
Is open source software right for your business?
Khalil KhalafFollowJul 11, 2017 · 6 min read

The term “open source” refers to products designed to be publicly accessible for people to use, modify and share. Open source software is software that anyone can access, inspect and enhance the source code that most users don’t ever see in normal circumstances. A source code is a list of text commands that is written by computer programmers, to be compiled or assembled into an executable computer program.
You might have heard of open source software and may have been encouraged to give it a try. After all, why pay for Autocad when you can use Qcad to create blueprints for your building, computer chips and car parts? Why pay for Photoshop when you can use Gimp to edit and enhance your images? Why pay for Microsoft Office while you can use LibreOffice to write, calculate and do excellent presentations?
There are many legitimate advantages to using open source software. However, there are downsides to using them, especially from the standpoint of day to day business life and development. Before committing to open source software, you should consider the following advantages and disadvantages.

ADVANTAGES:

Free and/or Cheaper than Commercial Products.
Open source software comes with a great advantage since it can be installed for free. Furthermore, it can be used and deployed again and again on multiple machines without the need of tracking the license compliance and terms of use. For example, according to Kate Rockwood, “…Instead of sinking 375 days — and $500,000 — into developing a proprietary code, Pendo went in the completely opposite direction: It downloaded an open-source software engine coded entirely by volunteer[s].”
Open source software help companies save the time and money by providing ready to use software as a whole. This software could be plugins (features to be added to existing software), Front ends and interfaces that are easy to integrate, or Back ends and easy to use engines. This might sound unbelievable, but open source programs are developed with the intention to be available to anyone, even those who can’t afford commercial software. Furthermore, many of these programs are created to work with almost any type of platform, which helps extend your hardware life and avoids the need to constantly replace them.
In the Software Development Life Cycle, there are three stages that are often underestimated by project managers: Testing, Debugging and Integration. If you are a software development company, you likely know now — after disappointing your clients — that these three stages consume almost the same time as time dedicated to other stages of the software project. Open source software is good at cutting down on the development and reduces the pain and time of development planning and stages.
Highly Reliable.
Open source software is usually developed by a group of talented and skillful experts. Sometimes, they are developed by tens or hundreds of volunteers that simply love what they do for the community. Hence why most of the open source software are high-quality programs. Also, since anyone can access the code and fix a bug, you will notice continuous improvement and new versions or features added to the software every now and then. This improvement and the code itself will always exist even if it was originally developed by a current dissolved company.
Also, you should know that any open source software can be customized and tweaked by you, which can help your company match the software with your business’s needs. You literally can do whatever you want with it and you aren’t locked into packages that are only compatible with each other. This can be especially helpful if you are a software development company. For example, if a client asked for a software with 10 features, you can download an open source with 5 features already done, and add in the missing 5 features. Or maybe an open source program with 10 features that do not match your client’s requirement but then you modify them for the perfect match. Or even an open source software with 15 features and simply remove or hide the additional 5 features. The point is, with this level of customization you can guarantee that the software could be reliable since it can be tweaked specifically by you.

DISADVANTAGES:

Not as User-Friendly as Commercial Software

This cannot be generalized for all open source software. For example LibreOffice, Mozilla Firefox and Android OS are amazingly easy to use. However, while there are several open source software that solve large problems super fast, complicated computation or big data, but sometimes not much attention is given to its GUI (Graphical User Interface). This can make the software annoying to work with especially for nontechnical users. Nontechnical companies may need to dedicate some time to train their team and get them up to speed for every new release of these open source programs. As for technical companies, especially software development companies, they may need to build a proper GUI and integrate it with the back end which may require as much time and money as rewriting the whole software.
Lack of extensive tech support
User communities are out there and can be very responsive, but you really can’t count on the community one hundred percent of the time since it is not their job. No one is getting paid for fixing your bugs, provide you or your team the proper training, or respond to your questions and requirements. If your client or employee is suffering from a bug, you are literally on your own. The best thing to do might be to just wait for somebody in the community to face the same issue and hopefully fix it. The other option would be to hire an expert dedicated to maintaining and improving the software.
Most of the times, you will also need to get your team up to speed. This is because of the constant development and in parallel between several community developers of open source software. Due to this, there is often confusion among the team since they are uncertain which version does what and if its compatible with other software and platforms. Hence where additional cost comes with every open source software.

FINAL THOUGHTS:
As a software developer myself, I provided the community with all the software that I wrote for personal projects. I have used open source software for personal needs and within software development jobs. I recommend using open source software since it saved me and my employer a lot of time and money, and made my clients happier. Also, looking at other developer’s codes and algorithms improved my skills in reusing codes that were written by someone else, and my experiences when reviewing others’ algorithms and logic.
As a result of all the benefits, I always contribute in online communities helping other developers and users. I consider doing that as giving back to the community. I would not have been a developer without the help of other developers who did to me in the past exactly what I am doing to others now. There is absolutely no passion for me to abandon a project and never reply to users, and I can say I never had problems in using an open source software; specifically to mention ones that got abounded. We developers love improving and fixing things. We simply love what we do.


FURTHER READING:
Using Open-Source Code Can Save You Half a Million Dollars — but Do It CarefullyNine thousand hours. That’s how much time financial tech firm Pendo Systems estimates it would take to write the code…www.inc.com
@@ -0,0 +1,66 @@
import os
import re
import sys

import requests
from bs4 import BeautifulSoup

# Switch to the directory containing this script so the output file is saved next to it.
# os.path handles both Windows and POSIX separators, and relative script paths.
os.chdir(os.path.dirname(os.path.abspath(__file__)))

# Fetch the article page and return its URL together with the parsed HTML.
def get_page():
    url = input('Enter url of a medium article: ')
    # Reject anything that is not a medium.com URL.
    if not url.startswith(('https://medium.com', 'http://medium.com')):
        print('Please enter a valid website, or make sure it is a medium article')
        sys.exit(1)
    res = requests.get(url)
    res.raise_for_status()
    soup = BeautifulSoup(res.text, 'html.parser')
    return url, soup

# Replace line-breaking tags with newlines, then strip all remaining HTML tags.
def purify(text):
    rep = {"<br>": "\n", "<br/>": "\n", "<li>": "\n"}
    pattern = re.compile("|".join(re.escape(k) for k in rep))
    text = pattern.sub(lambda m: rep[m.group(0)], text)
    # Raw string avoids the invalid escape sequences '\<' and '\>'.
    text = re.sub(r'<.*?>', '', text)
    return text

# Compile all of the scraped text into one string.
def collect_text(url, soup):
    fin = f'url: {url}\n\n'
    # The page title is split on '|': the first part is the article title, the second the byline.
    main = soup.head.title.text.split('|')
    fin += f'Title: {main[0].strip().upper()}\n{main[1].strip()}'

    header = soup.find_all('h1')
    j = 1

    # Everything that precedes the second <h1> is collected as the introduction.
    try:
        fin += '\n\nINTRODUCTION\n'
        for elem in list(header[j].previous_siblings)[::-1]:
            fin += f'\n{purify(str(elem))}'
    except IndexError:
        pass

    fin += f'\n\n{header[j].text.upper()}'
    # Walk the siblings after that heading; each further <h1> starts a new section.
    for elem in header[j].next_siblings:
        if elem.name == 'h1':
            j += 1
            fin += f'\n\n{header[j].text.upper()}'
            continue
        fin += f'\n{purify(str(elem))}'
    return fin

# Save the collected text to a file in the current directory.
def save_file(fin):
    with open('scraped_article.txt', 'w', encoding='utf8') as outfile:
        outfile.write(fin)
    print('File saved in current directory as scraped_article.txt')

# driver code
if __name__ == '__main__':
    url, soup = get_page()
    fin = collect_text(url, soup)
    save_file(fin)
