Friday, May 4, 2018

Work with Solr

Terminology

1) Solr instance: Zero or more cores can be configured to run inside a Solr instance. Each Solr instance requires a reference to a separate Solr home directory.

2) Solr core: Each index, together with the files that index requires, makes up a core. If your application requires multiple indexes, you can run multiple cores inside a single Solr instance.

3) Solr home: The directory that Solr refers to for almost everything. It contains all the information about the cores and their indexes, configurations, and dependencies.

4) Solr shard: This term is used in distributed environments, in which you partition the data between multiple Solr instances. Each chunk of data on a particular instance is called a shard. The shard contains a subset of the whole index. For example, say you have 30 million documents and plan to distribute these in three shards, each containing 10 million documents. You’ll need three Solr instances, each having one core with the same name and schema. While serving queries, any shard can receive the request and distribute it to the other two shards for processing, get all the results and respond back to the client with the merged result.
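
The 30-million-document example above can be sketched in shell. The helper names (docs_per_shard, shard_for_id) are hypothetical, and the checksum-modulo routing only illustrates the idea of partitioning; it is not the routing Solr itself uses.

```shell
#!/usr/bin/env bash
# Illustrative only: these helpers are not part of Solr.

# Documents each shard holds when the index is split evenly (integer division).
docs_per_shard() {
  local total_docs=$1 shard_count=$2
  echo $(( total_docs / shard_count ))
}

# Map a document id to a shard number in [0, shard_count) using a CRC checksum.
# Assumption: a simple modulo scheme chosen for clarity, not Solr's real router.
shard_for_id() {
  local doc_id=$1 shard_count=$2 checksum
  checksum=$(printf '%s' "$doc_id" | cksum | cut -d ' ' -f 1)
  echo $(( checksum % shard_count ))
}

docs_per_shard 30000000 3      # prints 10000000
shard_for_id "doc-42" 3        # prints a shard number: 0, 1, or 2
```

The same id always maps to the same shard, which is the property any partitioning scheme needs so that a document can be found again later.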


Solr types of distributed architecture:

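1) master-slave architecture [old]: The index is created on the master server and replicated to one or more slave servers dedicated to searching. This approach has limitations: the master is a single point of failure for indexing, and slaves only see new documents after the next replication.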

2) SolrCloud [new]: sets up a cluster of Solr servers to provide fault tolerance and high availability and to offer features such as distributed indexing, centralized configuration, automatic load balancing, and failover.


SolrCloud Terminology

Node: A single instance of Solr

Cluster: All the nodes in your environment together.

Collection: A complete logical index in a cluster.

Shard: A logical portion, or slice, of a collection.

Replica: The physical copy of a shard, which runs in a node as a Solr core.

Leader: Among all the replicas of a shard, one is elected as the leader. SolrCloud forwards update requests to the leader of the shard, which distributes them to the other replicas.

ZooKeeper: ZooKeeper is an Apache project widely used by distributed systems for centralized configuration and coordination. SolrCloud uses it for managing the cluster and electing a leader.


SolrCloud

Apache Solr includes the ability to set up a cluster of Solr servers that combines fault tolerance and high availability. Called SolrCloud, these capabilities provide distributed indexing and search capabilities, supporting the following features:
  • Central configuration for the entire cluster
  • Automatic load balancing and fail-over for queries
  • ZooKeeper integration for cluster coordination and configuration.
SolrCloud is flexible distributed search and indexing, without a master node to allocate nodes, shards and replicas. Instead, Solr uses ZooKeeper to manage these locations, depending on configuration files and schemas. Queries and updates can be sent to any server. Solr will use the information in the ZooKeeper database to figure out which servers need to handle the request.

Launch a SolrCloud cluster on your local workstation

bin/solr start -e cloud

(Run the previous command from the Solr installation directory. It asks the following questions; press Enter to accept each default.)

How many Solr nodes would you like to run in your local cluster (specify 1-4 nodes) [2]:?
Please enter the port for node1 [8983]:
Please enter the port for node2 [7574]:
Create a new collection, Please provide a name for your new collection [gettingstarted]:
How many shards would you like to split new collection into? [2]
How many replicas per shard would you like to create? [2]




Notice that:
Two instances of Solr have started on two nodes, one on port 7574 and one on port 8983.
One collection has been created with two shards, each with two replicas.
Solr Admin UI URL: http://localhost:8983/solr
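
Because any node can accept a request, the same query can be sent to either port. This is a minimal sketch, assuming the default gettingstarted collection and the two default ports from the prompts above; select_url is a hypothetical helper, not part of Solr.

```shell
#!/usr/bin/env bash
# Build the /select URL for a collection on a given local node port.
# select_url is a hypothetical helper name used only for this example.
select_url() {
  local port=$1 collection=$2
  printf 'http://localhost:%s/solr/%s/select?q=*:*&rows=0\n' "$port" "$collection"
}

# With the cluster running, both commands should report the same numFound,
# whichever node receives the request:
#   curl "$(select_url 8983 gettingstarted)"
#   curl "$(select_url 7574 gettingstarted)"
select_url 8983 gettingstarted
select_url 7574 gettingstarted
```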

Solr has two main configuration files: the schema file (named either managed-schema or schema.xml), and solrconfig.xml.


solrconfig.xml:
To log slow queries, set the <slowQueryThresholdMillis> element in the <query> section:
<slowQueryThresholdMillis>1000</slowQueryThresholdMillis>
Any query that takes longer than the specified threshold (in milliseconds) is logged as a "slow" query at the WARN level.






Delete all Solr records

<delete><query>*:*</query></delete>
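
One way to send this delete message is to post it to the collection's update handler with curl. This is a sketch under assumptions: a local node on port 8983 and the gettingstarted collection; update_url and delete_all_docs are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Assumed default: a local node on port 8983; override SOLR_URL if yours differs.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Print the update endpoint for a collection; commit=true applies the change immediately.
update_url() {
  printf '%s/%s/update?commit=true\n' "$SOLR_URL" "$1"
}

# Post the delete-by-query XML to Solr. This removes EVERY document in the collection.
delete_all_docs() {
  curl -sS -H 'Content-Type: text/xml' \
       --data-binary '<delete><query>*:*</query></delete>' \
       "$(update_url "$1")"
}

# Usage (not run here): delete_all_docs gettingstarted
update_url gettingstarted
```

Replacing *:* with a narrower query deletes only the matching documents.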

Thursday, May 3, 2018

Ubuntu important commands

lastb only shows login failures. Use last to see successful logins.



Check whether a package is installed

dpkg --list | grep phpmyadmin
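
The grep above also matches packages that only contain the string in their name or description, and packages that were removed but still have config files. For scripts, a stricter check can be sketched with dpkg-query; is_installed is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Return success (exit code 0) only when the named package is fully installed.
# is_installed is a hypothetical helper; dpkg-query ships with dpkg on Ubuntu.
is_installed() {
  dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

if is_installed phpmyadmin; then
  echo "phpmyadmin is installed"
else
  echo "phpmyadmin is not installed"
fi
```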




How to keep processes running after ending an SSH session?
  • SSH into your remote box. Type screen, then start the process you want.
  • Press Ctrl-A then Ctrl-D. This will "detach" your screen session but leave your processes running. You can now log out of the remote box.
  • If you want to come back later, log on again and type screen -r. This will "resume" your screen session, and you can see the output of your process.
OR use byobu:
sudo aptitude install byobu
Start byobu by typing byobu.
Press:
F2 to create a new window within the current session,
F3-F4 to switch between the various windows,
F6 (detach) to leave byobu and keep it running.




Get Installation directory 
whereis tomcat7

Compress Folder and save it as a file in current directory
tar -zcvf FileName.tar.gz -C /Folder/Name/To/Compress   .

Extract Compressed File
tar zxf solr-7.0.0.tgz
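
The two tar commands above can be tried safely as a round trip in a scratch directory. The file and folder names below are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so nothing outside it is touched.
workdir=$(mktemp -d)
mkdir -p "$workdir/source" "$workdir/restore"
echo "hello" > "$workdir/source/sample.txt"

# -C changes into the folder first, and "." archives its contents,
# so the archive stores ./sample.txt rather than the full path.
tar -zcf "$workdir/backup.tar.gz" -C "$workdir/source" .

# List the archive contents without extracting.
tar -ztf "$workdir/backup.tar.gz"

# Extract into another folder, again using -C to choose the destination.
tar -zxf "$workdir/backup.tar.gz" -C "$workdir/restore"
cat "$workdir/restore/sample.txt"   # prints hello
```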


Install PHP version 5.6
------------------------------
sudo apt-get install python-software-properties
sudo add-apt-repository ppa:ondrej/php
sudo apt-get update
sudo apt-get install -y php5.6

Un-Install PHP 7
-----------------------
sudo apt-get purge php7.0-common
sudo apt-get purge 'php7.*'


Install JDK 8.0
-------------------
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
sudo apt-get install oracle-java8-set-default

# Set the JAVA_HOME and JRE_HOME variables by adding the two lines below to /etc/environment
sudo nano /etc/environment
JAVA_HOME=/usr/lib/jvm/java-8-oracle
JRE_HOME=/usr/lib/jvm/java-8-oracle/jre


Install Apache2
---------------------
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install apache2 apache2-doc apache2-utils


Install MySQL
-------------------
sudo apt-get update
sudo apt-get install mysql-client-core-5.5
sudo apt-get install mysql-server-5.5
sudo mysql_secure_installation

Un-Install MySQL
----------------------
sudo apt-get remove --purge mysql-server mysql-client mysql-common
sudo apt-get autoremove
sudo apt-get autoclean
sudo rm -rf /var/lib/mysql
sudo rm -rf /etc/init.d/mysql
sudo rm -rf /etc/init/mysql.conf




Log in to MySQL, create a DB, import a backup, create a user, and add privileges
-------------------------------------------------------------------------------------------------------
mysql -u root -p


CREATE DATABASE ___my_database_name____ CHARACTER SET utf8 COLLATE utf8_general_ci;

# Run the import from the shell, not from the mysql prompt; "ddl" here is the target database name
mysql -u root -p ddl < /root/backupFile.sql
CREATE USER 'xxxx'@'localhost' IDENTIFIED BY 'Password';
GRANT ALL ON DatabaseName.* TO 'xxxx'@'localhost';

CREATE USER 'xxxx'@'127.0.0.1' IDENTIFIED BY 'Password';
GRANT ALL ON DatabaseName.* TO 'xxxx'@'127.0.0.1';

CREATE USER 'root'@'%' IDENTIFIED BY 'some_pass';
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%';

FLUSH PRIVILEGES;




Install Solr and upgrade old Solr version

Step 1. Install Java.
Solr is Java-based software, so it needs a Java environment. (As advised in the Solr wiki, prefer a full JDK to a plain JRE.)
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
sudo apt-get install oracle-java8-set-default
Step 2. Set up the JAVA_HOME and JRE_HOME variables.
sudo nano /etc/environment
Add two new lines:

JAVA_HOME=/usr/lib/jvm/java-8-oracle
JRE_HOME=/usr/lib/jvm/java-8-oracle/jre

Reboot, then validate by running:

echo $JAVA_HOME
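
Printing the variable only shows that it is set. A slightly stricter check, sketched below, also confirms that a java binary exists under it; check_java_home is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeeds only when JAVA_HOME is set and contains an executable bin/java.
# check_java_home is a hypothetical helper name used for this example.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "no executable java found under $JAVA_HOME/bin"
    return 1
  fi
  echo "JAVA_HOME looks valid: $JAVA_HOME"
}

check_java_home || true   # report the result without aborting the shell
```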





Install Default Solr version 


sudo add-apt-repository universe

Option 1: Install Solr with Tomcat
sudo apt-get install solr-tomcat
Option 2: Install Solr with Jetty
sudo apt-get install solr-jetty
Open the URL http://localhost:8080/solr/admin/ if your Tomcat is listening on port 8080.

Notes
If you get this error during installation:
tomcat7[4983]:  * no JDK or JRE found - please set JAVA_HOME

then do the following:

sudo nano /etc/default/tomcat7
Add this line:
JAVA_HOME=/usr/lib/jvm/java-8-oracle

To change the default Tomcat port, edit server.xml:
sudo nano /etc/tomcat7/server.xml
  • Search for "Connector port" and replace 8080 with the new port.
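
The port change can also be scripted instead of edited by hand. The sketch below assumes the connector line contains the literal text <Connector port="8080"; set_tomcat_port is a hypothetical helper, and the demonstration runs on a scratch file rather than the real server.xml.

```shell
#!/usr/bin/env bash
# Replace the HTTP connector port 8080 in a Tomcat server.xml with a new port.
# set_tomcat_port is a hypothetical helper; sed keeps a .bak copy of the file.
set_tomcat_port() {
  local server_xml=$1 new_port=$2
  sed -i.bak "s/<Connector port=\"8080\"/<Connector port=\"${new_port}\"/" "$server_xml"
}

# Demonstration on a scratch file instead of the real /etc/tomcat7/server.xml.
demo_xml=$(mktemp)
echo '<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000" />' > "$demo_xml"
set_tomcat_port "$demo_xml" 8081
cat "$demo_xml"
```

On the real file the helper needs root permissions, and Tomcat has to be restarted (sudo service tomcat7 restart) before the new port takes effect.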

Uninstall
sudo apt-get remove solr-tomcat
sudo apt-get remove solr-jetty 
sudo apt-get remove tomcat7-common
sudo apt autoremove




Install Specific version of Solr

1) Install Java
2) Get the URL from http://archive.apache.org/dist/lucene/solr/
3) Run the following commands

For Solr version 7.3.0

Steps: download the package, extract the installer script from the compressed package, then run that script and pass the compressed package as a parameter.

cd ~
wget http://archive.apache.org/dist/lucene/solr/7.3.0/solr-7.3.0.tgz
tar xzf solr-7.3.0.tgz solr-7.3.0/bin/install_solr_service.sh --strip-components=2
sudo bash ./install_solr_service.sh  solr-7.3.0.tgz

For Solr version 5.3.1

cd ~
wget http://archive.apache.org/dist/lucene/solr/5.3.1/solr-5.3.1.tgz
tar xzf solr-5.3.1.tgz solr-5.3.1/bin/install_solr_service.sh --strip-components=2
sudo chmod +x install_solr_service.sh
sudo ./install_solr_service.sh solr-5.3.1.tgz

OR


For Solr version 4.10.4

wget https://archive.apache.org/dist/lucene/solr/4.10.4/solr-4.10.4.tgz
tar -xvf solr-4.10.4.tgz
cp -R solr-4.10.4/example /opt/solr
cd /opt/solr
java -jar start.jar

Solr will then be available at http://your_server_ip:8983/solr
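
The archive.apache.org downloads above share one URL pattern, so the version number is the only part that changes. A small sketch of that pattern follows; solr_archive_url is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the archive download URL for a Solr release, following the
# URL pattern used in the steps above. solr_archive_url is a hypothetical helper.
solr_archive_url() {
  local version=$1
  printf 'http://archive.apache.org/dist/lucene/solr/%s/solr-%s.tgz\n' "$version" "$version"
}

solr_archive_url 5.3.1
# Usage (not run here): wget "$(solr_archive_url 5.3.1)"
```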


Uninstall via
sudo service solr stop
sudo rm -r /var/solr
sudo rm -r /opt/solr-5.3.1
sudo rm -r /opt/solr
sudo rm /etc/init.d/solr
sudo deluser --remove-home solr
sudo deluser --group solr

Use the following commands to stop, start, and check the status of the Solr service.
sudo service solr stop
sudo service solr start
sudo service solr status

Create First Solr Collection
sudo su - solr -c "/opt/solr/bin/solr create -c TestCollection1 -n data_driven_schema_configs"
  
This creates a new folder named TestCollection1 in /var/solr/data.


For more information check https://www.howtoforge.com/tutorial/how-to-install-and-configure-solr-on-ubuntu-1604/




Upgrade old Solr version

Use the following script: https://github.com/cominvent/solr-tools/tree/master/upgradeindex

Usage:

Script to Upgrade old indices from 3.x -> 4.x -> 5.x -> 6.x format, 
so it can be used with Solr 6.x or 7.x
Usage: ./upgradeindex.sh [-s] [-t target-ver] <indexdata-root>

Example: ./upgradeindex.sh -s -t 6 /opt/solr

Solr upgradeindex
https://github.com/cradules/bash_scripts/tree/master/solr-tools/upgradeindex
https://github.com/cominvent/solr-tools/tree/master/upgradeindex


Importing/Indexing database (MySQL or SQL Server) in Solr using Data Import Handler
https://gist.github.com/maxivak/3e3ee1fca32f3949f052