1676405460
This crawler automates the following step:
# upload pdf to googledrive, store data and notify via email
python script/spider.py -c config/prod.cfg -u googledrive -s firebase -n gmail
# download all format
python script/spider.py --config config/prod.cfg --all
# download only one format: pdf|epub|mobi
python script/spider.py --config config/prod.cfg --type pdf
# download also additional material: source code (if exists) and book cover
python script/spider.py --config config/prod.cfg -t pdf --extras
# equivalent (default is pdf)
python script/spider.py -c config/prod.cfg -e
# download and then upload to Google Drive (given the download url anyone can download it)
python script/spider.py -c config/prod.cfg -t epub --upload googledrive
python script/spider.py --config config/prod.cfg --all --extras --upload googledrive
# download and then upload to OneDrive (given the download url anyone can download it)
python script/spider.py -c config/prod.cfg -t epub --upload onedrive
python script/spider.py --config config/prod.cfg --all --extras --upload onedrive
# download and notify: gmail|ifttt|join|pushover
python script/spider.py -c config/prod.cfg --notify gmail
# only claim book (no downloads):
python script/spider.py -c config/prod.cfg --notify gmail --claimOnly
Before you start you should
python --version
git clone https://github.com/niqdev/packtpub-crawler.git
pip install -r requirements.txt
(see also virtualenv)
cp config/prod_example.cfg config/prod.cfg
[credential]
credential.email=PACKTPUB_EMAIL
credential.password=PACKTPUB_PASSWORD
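As a rough illustration of how a config in this format can be read, here is a minimal Python 3 sketch using the standard configparser module. It is not taken from the crawler's source; the function name read_credentials and the inline sample text are assumptions made for the example.

```python
import configparser

# Sample text mirroring the [credential] section shown above.
SAMPLE_CFG = """
[credential]
credential.email=PACKTPUB_EMAIL
credential.password=PACKTPUB_PASSWORD
"""


def read_credentials(cfg_text):
    """Return (email, password) from config text (hypothetical helper)."""
    # interpolation=None keeps a literal '%' in a password from being
    # treated as an interpolation marker.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    section = parser["credential"]
    return section["credential.email"], section["credential.password"]


email, password = read_credentials(SAMPLE_CFG)
print(email)
```

To read the real file instead of the sample string, parser.read("config/prod.cfg") can replace read_string.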
Now you should be able to claim and download your first eBook
python script/spider.py --config config/prod.cfg
From the documentation, Google Drive API requires OAuth2.0 for authentication, so to upload files you should:
config/client_secrets.json
[googledrive]
...
googledrive.client_secrets=config/client_secrets.json
googledrive.gmail=GOOGLE_DRIVE@gmail.com
Now you should be able to upload your eBook to Google Drive
python script/spider.py --config config/prod.cfg --upload googledrive
The first time only, you will be prompted to log in through a browser with JavaScript enabled (text-based browsers will not work) to generate config/auth_token.json. You should also copy the FOLDER_ID into the config; otherwise a new folder with the same name is created on every run.
[googledrive]
...
googledrive.default_folder=packtpub
googledrive.upload_folder=FOLDER_ID
Documentation: OAuth, Quickstart, example and permissions
From the documentation, OneDrive API requires OAuth2.0 for authentication, so to upload files you should:
[onedrive]
...
onedrive.client_id=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
onedrive.client_secret=XxXxXxXxXxXxXxXxXxXxXxX
Now you should be able to upload your eBook to OneDrive
python script/spider.py --config config/prod.cfg --upload onedrive
The first time only, you will be prompted to log in through a browser with JavaScript enabled (text-based browsers will not work) to generate config/session.onedrive.pickle.
[onedrive]
...
onedrive.folder=packtpub
Documentation: Registration, Python API
To upload your eBook via scp to a remote server, update the configs
[scp]
scp.host=SCP_HOST
scp.user=SCP_USER
scp.password=SCP_PASSWORD
scp.path=SCP_UPLOAD_PATH
Now you should be able to upload your eBook
python script/spider.py --config config/prod.cfg --upload scp
Note:
scp.path on the remote server must exist in advance
--upload scp is incompatible with --store and --notify
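If you wrap the crawler in your own script, the incompatibility noted above can be caught before anything runs. The sketch below is a hypothetical guard built with argparse. It reuses the flag names from the commands in this document, but it is not the crawler's own argument handling, and the function name parse_args is an assumption.

```python
import argparse


def parse_args(argv):
    """Parse a subset of the flags and reject the unsupported combination."""
    parser = argparse.ArgumentParser(description="hypothetical wrapper")
    parser.add_argument("-u", "--upload")
    parser.add_argument("-s", "--store")
    parser.add_argument("-n", "--notify")
    args = parser.parse_args(argv)
    # Per the note above, scp uploads cannot be combined with
    # --store or --notify.
    if args.upload == "scp" and (args.store or args.notify):
        parser.error("--upload scp cannot be combined with --store or --notify")
    return args


args = parse_args(["--upload", "scp"])
print(args.upload)
```

parser.error prints a usage message and exits with a non-zero status, so a bad combination fails fast.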
Create a new Firebase project, copy the database secret from your settings at https://console.firebase.google.com/project/PROJECT_NAME/settings/database and update the configs
[firebase]
firebase.database_secret=DATABASE_SECRET
firebase.url=https://PROJECT_NAME.firebaseio.com
Now you should be able to store your eBook details on Firebase
python script/spider.py --config config/prod.cfg --upload googledrive --store firebase
To send a notification via email using Gmail you should:
[gmail]
...
gmail.username=EMAIL_USERNAME@gmail.com
gmail.password=EMAIL_PASSWORD
gmail.from=FROM_EMAIL@gmail.com
gmail.to=TO_EMAIL_1@gmail.com,TO_EMAIL_2@gmail.com
Now you should be able to notify your accounts
python script/spider.py --config config/prod.cfg --notify gmail
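To show how the comma-separated gmail.to value maps onto recipients, here is a minimal Python 3 sketch that only builds the message. It is an assumption-laden illustration, not the crawler's code: the function name build_notification and the subject and body text are made up for the example.

```python
from email.message import EmailMessage


def build_notification(sender, recipients_csv, subject, body):
    """Build an email and the recipient list from a comma-separated string."""
    recipients = [a.strip() for a in recipients_csv.split(",") if a.strip()]
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg, recipients


msg, recipients = build_notification(
    "FROM_EMAIL@gmail.com",
    "TO_EMAIL_1@gmail.com,TO_EMAIL_2@gmail.com",
    "New eBook claimed",        # hypothetical subject
    "Your daily eBook is ready.",  # hypothetical body
)
print(recipients)
# Sending is left out on purpose; with the standard library it would go
# through smtplib.SMTP_SSL("smtp.gmail.com", 465) after login().
```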
[ifttt]
ifttt.event_name=packtpub-crawler
ifttt.key=IFTTT_MAKER_KEY
Now you should be able to trigger the applet
python script/spider.py --config config/prod.cfg --notify ifttt
Value mappings:
[join]
join.device_ids=DEVICE_IDS_COMMA_SEPARATED_OR_GROUP_NAME
join.api_key=API_KEY
Now you should be able to trigger the event
python script/spider.py --config config/prod.cfg --notify join
[pushover]
pushover.user_key=PUSHOVER_USER_KEY
pushover.api_key=PUSHOVER_API_KEY
Create a new branch
git checkout -b heroku-scheduler
Update the .gitignore and commit your changes
# remove
config/prod.cfg
config/client_secrets.json
config/auth_token.json
# add
dev/
config/dev.cfg
config/prod_example.cfg
Create, config and deploy the scheduler
heroku login
# create a new app
heroku create APP_NAME --region eu
# or if you already have an existing app
heroku git:remote -a APP_NAME
# deploy your app
git push -u heroku heroku-scheduler:master
heroku ps:scale clock=1
# useful commands
heroku ps
heroku logs --ps clock.1
heroku logs --tail
heroku run bash
Update script/scheduler.py with your own preferences.
More info about Heroku Scheduler, Clock Processes, Add-on and APScheduler
Build your image
docker build -t niqdev/packtpub-crawler:2.4.0 .
Run manually
docker run \
--rm \
--name my-packtpub-crawler \
niqdev/packtpub-crawler:2.4.0 \
python script/spider.py --config config/prod.cfg
Run scheduled crawler in background
docker run \
--detach \
--name my-packtpub-crawler \
niqdev/packtpub-crawler:2.4.0
# useful commands
docker exec -i -t my-packtpub-crawler bash
docker logs -f my-packtpub-crawler
Alternatively, you can pull this fork from Docker Hub
docker pull kuchy/packtpub-crawler
Add this to your crontab to run the job daily at 9 AM:
crontab -e
00 09 * * * cd PATH_TO_PROJECT/packtpub-crawler && /usr/bin/python script/spider.py --config config/prod.cfg >> /tmp/packtpub.log 2>&1
Create two files in /etc/systemd/system, packtpub_crawler.service and packtpub_crawler.timer:
[Unit]
Description=run packtpub-crawler
[Service]
User=USER_THAT_SHOULD_RUN_THE_SCRIPT
ExecStart=/usr/bin/python2.7 PATH_TO_PROJECT/packtpub-crawler/script/spider.py -c config/prod.cfg
[Install]
WantedBy=multi-user.target
[Unit]
Description=Runs packtpub-crawler every day at 7
[Timer]
OnBootSec=10min
OnActiveSec=1s
OnCalendar=*-*-* 07:00:00
Unit=packtpub_crawler.service
Persistent=true
[Install]
WantedBy=multi-user.target
Enable the timer with sudo systemctl enable packtpub_crawler.timer. You can test the service with sudo systemctl start packtpub_crawler.timer and see the output with sudo journalctl -u packtpub_crawler.service -f.
The script also downloads the free eBooks from the weekly Packtpub newsletter. The URL is generated by a Google Apps Script that parses all the mails. You can get the code here; to see the actual script, clone the spreadsheet and go to Tools > Script editor...
To use your own source, change this setting in the config
url.bookFromNewsletter=https://goo.gl/kUciut
The URL should point to a file containing only the URL (no semicolons, HTML, JSON, etc).
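If you host your own source file, it can help to check that its content really is a single bare URL before pointing the config at it. The following Python 3 sketch is one possible check under that reading of the requirement. The function name extract_single_url is hypothetical and is not part of the crawler.

```python
from urllib.parse import urlparse


def extract_single_url(file_content):
    """Return the URL if the content is exactly one http(s) URL, else raise."""
    text = file_content.strip()
    # Any inner whitespace means there is more than a bare URL in the file.
    if not text or any(ch.isspace() for ch in text):
        raise ValueError("expected exactly one URL and nothing else")
    parsed = urlparse(text)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("not an http(s) URL: %r" % text)
    return text


print(extract_single_url("https://example.com/free-book\n"))
```

Content wrapped in HTML or JSON fails the check, because it either contains whitespace or does not parse as an http(s) URL.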
You can also clone the spreadsheet to use your own Gmail account. Subscribe to the newsletter (on the bottom of the page) and create a filter to tag your mails accordingly.
Install paramiko with sudo -H pip install paramiko --ignore-installed
Install missing dependencies as described here
# install pip + setuptools
curl https://bootstrap.pypa.io/get-pip.py | python -
# upgrade pip
pip install -U pip
# install virtualenv globally
sudo pip install virtualenv
# create virtualenv
virtualenv env
# activate virtualenv
source env/bin/activate
# verify virtualenv
which python
python --version
# deactivate virtualenv
deactivate
Run a simple static server with
node dev/server.js
and test the crawler with
python script/spider.py --dev --config config/dev.cfg --all
This project is just a Proof of Concept and not intended for any illegal usage. I'm not responsible for any damage or abuse, use it at your own risk.
Author: Niqdev
Source Code: https://github.com/niqdev/packtpub-crawler
License: MIT license
1675283460
Mounting is the procedure through which a computer’s software makes the directories and files on a storage device (such as a hard disk, CD-ROM, or network share) accessible to users via the computer’s file system. This article explains how to mount and use Google Drive in Linux Mint.
Mounting makes a drive or other storage appear integrated into the Linux file system, which matters for permissions and other rules. Google Drive on Linux Mint works much as it does on Mac and Windows. Google offers a sizable amount of free cloud storage: new customers get approximately 15 GB, and existing users may have considerably more depending on the plan they choose. With Google Drive, we can safely store our files and access, view, or modify them from any device.
We will show you how to use GNOME Online Accounts to mount Google Drive on Linux. Follow the steps below.
Let us open the GNOME Online Accounts application and add our Google account. To do this, click the Start (menu) button on the desktop panel. As shown in the figure below, a menu appears on the screen with a search bar at the top.
Type “online account” in the search bar to look for it in the menu. When the “Online Accounts” option appears, click it to open it.
The window shown in the screenshot then appears, listing the account providers that GNOME Online Accounts supports. Select “Google” from the options on this screen.
After clicking Google, you will be prompted for your Gmail login information, as shown in the image below. Enter your email address and password. In the next step, you will be asked to verify your account, an additional check that you are the one trying to log in.
When the authentication phase is complete, click “Allow” to grant GNOME access to your account.
After authentication, you will be asked which features you want to give access to. The features shown in the window are “Mail,” “Calendar,” “Contacts,” “Photos,” “Files,” and “Printers”. You can choose whichever you want, but since this article is about mounting Google Drive in Linux, only “Files” needs to be enabled. Make sure the switch next to “Files” is turned on, and then close the dialog box by clicking the cross button in the top left corner.
As shown in the screenshot below, your Google account is now successfully connected. When you reach this stage, Google Drive is ready for use.
Utilize the Linux GUI to Access Google Drive
This portion will demonstrate how to use the Linux GUI to access Google Drive. The term “Linux GUI” refers to a feature or utility that supports a user interface, enables user interaction with the system, and does it with the assistance of windows, icons, graphics, etc.
Your account is now active, so you can use Google Drive. If you open the file manager, you will notice a new drive entry labeled with your Gmail address. You can copy, rename, and delete files on this Google Drive the same way as in any other folder: right-click a file and choose Copy, Rename, or Delete.
In this part, we will explain how to use Linux commands to access Google Drive. First, we must obtain the “UID”. Every account on the system has a unique “UID”, which is issued by Linux. This number identifies the user and is also used to determine which system resources they can access. To get it, run the echo command with “$UID”, which prints the current user's UID on the console as an integer.
omar@omar-VirtualBox:~$ echo $UID
The UID, which is “1000” here, is displayed on the terminal. When you run this command on your own system, you may get a different ID, depending on your account.
1000
In the following step, we use “cd” to switch to the mount directory. The path starts with “/run/user”, followed by our UID (“1000”), then “gvfs”, and finally the “google-drive” entry, which names the host and the Google account we use (“omar11”). Remember to separate the parts with slashes and to escape the colon, equals sign, and comma with backslashes.
omar@omar-VirtualBox:~$ cd /run/user/1000/gvfs/google-drive\:host\=gmail.com\,user\=omar11/
omar@omar-VirtualBox:/run/user/1000/gvfs/google-drive:host=gmail.com,user=omar11$
You can see that the directory is changed when we execute this command.
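The same path can be assembled in a script instead of being typed by hand. This is a minimal Python sketch, assuming the mount follows the /run/user/UID/gvfs/google-drive:host=...,user=... pattern shown above. The function name gvfs_drive_path is hypothetical.

```python
import os


def gvfs_drive_path(user, host="gmail.com", uid=None):
    """Build the gvfs mount path for a Google account (pattern assumed from above)."""
    if uid is None:
        uid = os.getuid()  # same number that `echo $UID` prints
    return "/run/user/{}/gvfs/google-drive:host={},user={}".format(uid, host, user)


path = gvfs_drive_path("omar11", uid=1000)
print(path)
# os.listdir(path) would list the drive's top-level entries once mounted.
```

No backslash escaping is needed here, because the path is passed to Python directly rather than through a shell.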
Now, we are going to display what is currently in Google Drive. To do this, we use the Linux command “ls”, which prints a listing, followed by a dash and “lrt”. The listing is displayed on the console; there are presently no files on our drive.
omar@omar-VirtualBox:/run/user/1000/gvfs/google-drive:host=gmail.com,user=omar11$ ls -lrt
The result of this command indicates that there are no files. However, we have two directories in our Google Drive, and their IDs are displayed instead of their names.
total 0
dr-x------ 1 omar omar 0 Jan 1 1970 GVfsSharedWithMe
drwx------ 1 omar omar 0 Jan 1 1970 0AI5FhhuutLBuUK9PVA
To display the names of the files or directories, we use the “gio” command: type “gio”, then “list”, then the “-a” option, followed by the attribute “standard::display-name” in quotes.
omar@omar-VirtualBox:/run/user/1000/gvfs/google-drive:host=gmail.com,user=omar11$ gio list -a "standard::display-name"
After running this command, you can see the names of the two directories that currently exist in Google Drive. The first one is called “My Drive,” and the second is called “Shared with me”. If there are any files on Google Drive, their names will also be displayed.
0AI5FhhuutLBuUk9PVA 0 (directory) standard::display-name=My Drive
GVfsSharedWithMe 0 (directory) standard::display-name=Shared with me
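A listing like the one above can be turned into an ID-to-name lookup. The following Python sketch assumes each line has the shape "ID SIZE (TYPE) standard::display-name=NAME" as in this example. The function name parse_gio_listing is hypothetical, and the pattern does not handle entries whose ID contains spaces.

```python
import re

# One listing line: ID, size, (type), then the display-name attribute.
LINE_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<size>\d+)\s+\((?P<type>[^)]+)\)\s+"
    r"standard::display-name=(?P<display>.*)$"
)


def parse_gio_listing(output):
    """Map each entry ID to its human-readable display name."""
    names = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            names[match.group("name")] = match.group("display")
    return names


SAMPLE = """\
0AI5FhhuutLBuUk9PVA 0 (directory) standard::display-name=My Drive
GVfsSharedWithMe 0 (directory) standard::display-name=Shared with me
"""
print(parse_gio_listing(SAMPLE))
```

On a real system, the text could come from subprocess.run(["gio", "list", "-a", "standard::display-name"], capture_output=True, text=True).stdout, run from inside the mounted directory.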
We have covered how to mount and use Google Drive on Linux in this article. To mount Google Drive using GNOME Online Accounts, we first connected our Google account by opening Online Accounts and entering the account credentials. After the account was verified, we enabled the “Files” option in the dialog box, which connected the account. In the following section, we described how to access Google Drive using the GUI, and in the last section, we demonstrated how to do the same from the command line.
Original article source at: https://linuxhint.com/
1658979761
In this tutorial, we will learn how to upload local files to Google Drive using the Google Drive API, with the file information stored in Excel.
This is an extremely useful utility tool when you want to copy or move multiple files from one Google Drive account to another or if you want to migrate files from Google Drive to a different Cloud Drive service.
Google.py source code: https://learndataanalysis.org/google-py-file-source-code/
Subscribe: https://www.youtube.com/c/JieJenn/featured
1653467467
In this video, I show you how to get text from an image with Google Drive.
You can see more at: How To Convert Image To Text Using Google Docs
1651912514
In this video, we will show you how to convert an image to text using Google Docs. Yes, there is a free OCR service inside Google Drive, and not many people know about it. I will also show you another method that you can use if you don't have Google Drive.
Subscribe: https://www.youtube.com/c/AliMirza/featured
1649198460
MyDrive
MyDrive is an open source cloud file storage server (similar to Google Drive). Host myDrive on your own server or trusted platform and then access myDrive through your web browser. MyDrive uses MongoDB to store file/folder metadata, and supports multiple databases to store the file chunks, such as Amazon S3, the filesystem, or just MongoDB. MyDrive is built using Node.js and TypeScript. The service now even supports Docker images!
Go to the main myDrive website for more information, screenshots, and more.
Required:
Windows users will usually need both the Microsoft Visual Build Tools and Python 2. These are required to build the sharp module:
Linux users will need to make sure they have 'build-essential' installed:
sudo apt-get install build-essential
Setup:
Install Node Modules
npm install
Run the build command
npm run build
Create Environment Variables: Easily create environment variables with the built-in command. This command will start a server where you can type in the environment variables through a web UI in your browser.
npm run setup
Rebuild the project after entering environment variables
npm run build
(Optional) Create the MongoDB indexes, this increases performance. MongoDB must be running for this command to work.
npm run create-indexes-database
Start the server
npm run start
MyDrive will first host a server on http://localhost:3000 in order to safely get the encryption key, just navigate to this URL in a browser, and enter the encryption key. You can access this URL through your IP address also, but localhost is recommended to avoid man in the middle attacks.
If you're using a service like SSH or a Droplet, you can forward the localhost connection safely like so:
ssh -L localhost:3000:localhost:3000 username@ip_address
Note: You can also disable using the webUI for the encryption key by providing a key in the server environment variables (e.g. KEY=password), but this is not recommended because it greatly reduces security.
MyDrive has built-in Docker support, with two options: use the Docker image that has MongoDB built in, or use the Docker image that contains only MyDrive (if you're using a service like Atlas).
Create the Docker environment variables by running the 'npm run setup' command as seen in the installation section, or by manually creating the file (e.g. docker-variables.env on the root of the project; see the environment section for more information).
Docker with mongoDB image:
docker-compose build
Docker without mongoDB image:
docker-compose -f docker-compose-no-mongo.yml build
Start the Docker Image:
docker-compose up
If you are running a previous version of myDrive, such as myDrive 2, you must perform the following steps before you will be able to run myDrive properly. An easy way to tell if you are running a previous version is to check whether you have the old UI/look of myDrive 2. If your home page looks different from the myDrive 3 design, you are most likely running myDrive 2.
First, I recommend creating a new folder for myDrive 3, so that if you have difficulties with myDrive 3 you can easily revert to myDrive 2.
After you install the node modules, run setup, and build the project, you can run the script that clears the authentication tokens of all users. This is needed because the schema for tokens has changed; as a result, all users will have to log back in.
Run the following command:
Remove old tokens
npm run remove-tokens
If successful, you should see in the terminal the number of users that had their tokens removed. If you run into any errors, check your environment variables and make sure the project is built properly.
Modern and colorful design
Upload Files
Download Files
Image Viewer
Video Viewer
Image Thumbnails
Share Files
Search For Files/Folders
Move File/Folders
Google Drive Support
You can easily create environment variables using the built-in setup tool 'npm run setup', or manually create the files.
Create a config folder on the root of the project, and create a file with the name prod.env for the server. For the client variables, create a .env.production file in the root of the project.
Docker: If you're using Docker, instead create a file named 'docker-variables.env' on the root of the project. You must also include DOCKER=true in the server's environment variables.
Server Environment Variables:
Client Environment Variables
For a more detailed setup guide please visit the main myDrive site: https://mydrive-storage.com/
The wiki includes guided setup: https://github.com/subnub/myDrive/wiki
I created a short YouTube video showing off myDrive's design and features: https://www.youtube.com/watch?v=_bcADP6hDDI&feature=youtu.be
Demo: https://mydrive-3.herokuapp.com/
Patreon: https://www.patreon.com/subnub
Contact Email: kyle.hoell@gmail.com
Author: Subnub
Source Code: https://github.com/subnub/myDrive
License: GPL-3.0 License
1646048891
Easily sell PDF ebooks, MP3 audio, software licenses, graphic files, and other digitally downloadable products with PayPal and Google Sheets.
Customers can pay inline without leaving your website. The files are hosted on Google Drive, the PDF invoices are generated with Google Sheets, and the order and the files are delivered via Gmail.
1646027160
Receive large files from clients, students or anyone in your Google Drive with File Upload Forms. Your Forms can also include e-signatures and CAPTCHA, and you can protect file upload forms with a password. The responses are saved in a Google Sheet. You can also send instant email notifications to the form respondent with a copy of the form response.
1636624200
In this tutorial, I will be covering how to use Drive API to access your Google Drive Storage Information in Python.
For G Suite accounts, Google Drive doesn’t show the actual storage number anymore. I have to go to the Drive storage summary to view the information. I also wish Google Drive would display the storage the trash folder uses. Fortunately, that information is available with Google Drive API.
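As a small illustration of the storage figures involved, the sketch below converts a storageQuota dictionary into GiB. In Drive API v3, the About resource reports limit, usage, usageInDrive and usageInDriveTrash as byte counts encoded as strings. The sample numbers and the function name summarize_quota are made up for this example.

```python
QUOTA_KEYS = ("limit", "usage", "usageInDrive", "usageInDriveTrash")


def summarize_quota(storage_quota):
    """Convert Drive storageQuota byte strings into GiB (None if a key is absent)."""
    summary = {}
    for key in QUOTA_KEYS:
        value = storage_quota.get(key)
        # "limit" can be missing for accounts without a fixed quota.
        summary[key] = None if value is None else int(value) / 1024 ** 3
    return summary


# Made-up sample; with google-api-python-client the dictionary would come from
# service.about().get(fields="storageQuota").execute()["storageQuota"].
sample = {
    "limit": "16106127360",
    "usage": "1073741824",
    "usageInDrive": "536870912",
    "usageInDriveTrash": "0",
}
print(summarize_quota(sample))
```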
📑 Google.py source code: https://learndataanalysis.org/google-py-file-source-code/
1603995240
Receive large files from clients, students or anyone in your Google Drive with File Upload Forms.
Your Forms can also include e-signatures and CAPTCHA, and you can protect file upload forms with a password. The responses are saved in a Google Sheet. You can also send instant email notifications to the form respondent with a copy of the form response.
File Upload Form Demo - https://forms.studio/demo/
File Upload Form Google Sheet - https://forms.studio/copy/
Subscribe : https://www.youtube.com/channel/UC4gON4eBraggD3DiO3ttlIg
#google #googledrive
1594194900
Node.js Express Google Drive API Upload File Using Google Apis Client Library Full Project
Download the source code of application here:
https://codingshiksha.com/javascript/node-js-express-google-drive-api-upload-file-using-google-apis-client-library-full-project/
#googledrive #nodejsproject #nodejs #express