Russian Board / Приветствую, камрады
« Last post by serger on January 16, 2021, 04:42:21 PM »
Something went wrong: Cyrillic in the post body does not get posted. I hope Eric finds a simple solution.
test line:
????
test line:
????
I am writing this in English, but xenoscepter has a good point.
Google Translate isn't the meme it used to be, and we can understand the Italian quite well because of it.
Google Translate is quite good nowadays; however, I know how my friend feels, as I have been there. Ten years ago I was living in Rome, and I have now been in New Zealand for 8 years. That helped me a lot in getting to grips with many things.
Giallu, if you need anything, give a shout. There aren't that many things you need to learn in English, since most of it is numbers and rules.
Obviously, if you have to follow discussions or chat, it gets too complicated.
If there is anything in particular you need, fire away with the question here, so it also stays on record for those who come later.
Frog... how are you, you old rascal who escaped the Queen Mothers?
#For the Aurora 4X forum
import requests
from bs4 import BeautifulSoup as bs
#The two libraries.
def find_text(webpage):
    """Takes the webpage as input, output is a list containing all the text as strings"""
    step1 = webpage.find('body') #Finding the forum posts.
    step2 = step1.find('div', attrs = {'id': 'forumposts'}) #This forum uses alternating background formats.
    step3 = step2.find_all('div', attrs = {'class': 'windowbg'}) #windowbg2 is slightly darker.
    step3_b = step2.find_all('div', attrs = {'class': 'windowbg2'}) #You need to grab them separately.
    step4 = []
    step4_b = []
    for post in step3: #Finding the text from the larger forum post and putting it in another list.
        var = post.find('div', attrs = {'class': 'inner'})
        step4.append(var.text)
    for post in step3_b: #Doing the same with the dark background posts.
        var = post.find('div', attrs = {'class': 'inner'})
        step4_b.append(var.text)
    if len(step4) > len(step4_b): #I am too lazy to work out a better way to make sure the posts are merged in order.
        step4_b.append('delete this later') #So if there is not an equal number of windowbg and windowbg2 posts,
    length = len(step4) + len(step4_b) #a fake one will be added, it will be removed later.
    final_step = []
    for i in range(length // 2): #Merging all the text to a final list in order with separating lines.
        final_step.append(step4[i])
        final_step.append('\n')
        final_step.append(step4_b[i])
        final_step.append('\n')
    if 'delete this later' in final_step: #Remove the fake post again; this also works when no posts were found.
        final_step.remove('delete this later')
    output = formatting_func(final_step) #Finalise the formatting so the text is not all on one line.
    #for i in range(len(final_step)): #Uncomment this if you want to print the text with your IDE.
    #    print(final_step[i], '\n')
    return output
def formatting_func(final_step):
    """There was an issue with all the text being on one line, which was confusing in Google Translate."""
    output = []
    max_length = 151
    for line in final_step:
        old_index = 0
        if len(line) > 1: #Empty lines are length 1.
            loops = len(line)//max_length #Set the number of lines per post.
        else:
            loops = 0
        if loops == 0:
            output.append(line)
        else:
            for j in range(loops+1):
                if len(line) > (max_length*(j+1)):
                    index = max_length*(j+1) #Fall back to a hard break if no space is found nearby.
                    for i in range(20): #If you try to make a new line, you might cut a word in half, which messes up
                        if line[((max_length*(j+1))-i)] == ' ': #Google Translate. This goes backwards and finds a
                            index = ((max_length*(j+1))-i) #suitable space to make the new line at.
                            break
                    output.append(line[old_index:index])
                    old_index = index
                else:
                    output.append(line[old_index:])
                    break #The rest of the line is written; stop so it is not appended twice.
    return output #Returns the final formatted list.
def download_text(text_file, file_path, document_list):
    """Writes the text onto your computer as a new file."""
    for line in text_file:
        document_list.append(f'{line}\n') #Extra newline.
    with open(file_path, 'a+', encoding='utf-8') as handler: #Writing to file_path; UTF-8 keeps non-English text intact.
        for lines in document_list:
            handler.write(lines)
url = '' #Needs to be the full url, example: http://aurora2.pentarch.org/index.php?topic=11579.0
file_path = '/users/username/desktop/filename.txt'
#file_path is the place you want to save your text file and what you want to call it.
#Make sure it is formatted correctly!!
r = requests.get(url)
webpage = bs(r.content, 'html.parser') #Name the parser explicitly so bs4 does not have to guess one.
document_list = []
text_file = find_text(webpage) #Grabs the text.
download_text(text_file, file_path, document_list) #Writes the file.
[/spoiler]It should be fairly easy to write a script to scrape all the text from a forum page and write it to a text file. I know Google Translate has a feature to translate an entire text file, so you could just scrape any guides or the change list page by page. I wrote something like that in Python to translate something from Finnish to see how it worked. I used Python 3, the requests library and the BeautifulSoup library.
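For anyone adapting the script, here is a minimal sketch of how the merging and line-splitting steps could lean on the standard library instead. It is not the code from the script above: `interleave` and `wrap_post` are hypothetical helper names, and it assumes the post texts have already been pulled out into two plain lists of strings (light and dark background posts).

```python
import textwrap
from itertools import zip_longest


def interleave(light_posts, dark_posts):
    """Merge two lists alternately, keeping leftovers from the longer one."""
    merged = []
    for light, dark in zip_longest(light_posts, dark_posts):
        # zip_longest pads the shorter list with None, so skip the padding.
        if light is not None:
            merged.append(light)
        if dark is not None:
            merged.append(dark)
    return merged


def wrap_post(text, width=150):
    """Split one post into lines of at most `width` characters at word boundaries."""
    # textwrap.wrap returns an empty list for empty text, so keep one empty line.
    return textwrap.wrap(text, width=width) or ['']


# Made-up example data, only to show the shape of the result.
posts = interleave(['first post', 'third post'], ['second post'])
lines = [line for post in posts for line in wrap_post(post, width=8)]
print(posts)
print(lines)
```

Note that `textwrap.wrap` turns line breaks inside a post into spaces by default, so paragraph breaks within a single post are lost; that may or may not matter for pasting into a translator.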